If Radek Osmulski can use artificial intelligence to figure out a way to translate whale song, he probably won’t have too much trouble improving mine-equipment maintenance with data science.
The point being, a forum on Machine Learning in Mining in Perth, Western Australia, heard this week that Osmulski is among a fast-growing “globally distributed network” of skilled freelance data scientists who might be drawn to a mining engineering or exploration challenge, or even help a grain growers’ co-operative find a better way to classify grain with data (which he did with CBH Group in Australia).
“When he’s not classifying images of grain, Radek is using his machine learning skills to try to translate whale song,” Unearthed managing director Justin Strharsky said at the conference.
“In the very near future, because of the efforts of people in this globally distributed network with machine learning skills, we may be able to understand what whales are saying to each other. How crazy is that?
“Predicting machine failure should be easy next to that.”
Shravan Kumar Koninti, who works on drug discovery for Swiss company Novartis, and Chevron AI expert Ramdhan Ari Wibawa are two other data scientists said to be “representative of what that distribution of digital skills internationally looks like”.
Strharsky, who co-founded Unearthed in Australia nearly nine years ago, has built a network of hundreds of talented information technology, engineering and geoscience “innovators” in more than 40 countries. They compete in competitions orchestrated by Unearthed and sponsoring mining, power, manufacturing and other companies. Various groups are said to have addressed more than 200 challenges from energy and resources companies, in the process devising prototype technologies and algorithms that can find their way into the broader market.
Central to the engagement of outside experts to address mining and other technical problems is the provision of data by the sponsoring companies – sometimes proprietary data – for analysis, modelling and solution building.
Strharsky told the Machine Learning in Mining conference that the reluctance of mining companies – indeed companies in various industries – to share data needed to be overcome if they were going to tap valuable data science skills and expertise in the burgeoning global “creator” and “passion” economy.
The conference later heard from Rob Johnston of CITIC Pacific Mining – one of about 100 companies involved in a Global Mining Guidelines Group project aimed at establishing guidelines for “open datasets for AI in mining” – who said the recommendations were an important step towards bringing the industry into line with more advanced data-sharing industries such as IT and even oil and gas. The guidelines are due out this month.
Strharsky said a cursory look at current unfilled data science jobs advertised in the US and Australia alone showed more than 86,000 positions. Meanwhile, an online machine learning course run by renowned Stanford University professor and former head of Google Brain, Andrew Ng, had turned out 4.5 million graduates.
“We’ve got tonnes of unfilled roles in data science. If we’re here to talk about ML in industry, that starts with data,” Strharsky said.
“Anybody tried to hire a data scientist lately? It’s difficult at the moment. Part of that is because of this, what’s been called, the great resignation … that distribution of digital skills is affecting all of our businesses and will continue to do so. How do we turn that to our advantage?
“My contention is that sharing your data is the first and single most important step in the journey to being able to leverage that global distribution of skills.”
There were at least five good reasons to “share your data more”, said Strharsky, adding that a number of miners were already on board with the concept. Australian gold major Newcrest Mining and mid-tier copper producer OZ Minerals were among the companies that had shared operational data with groups of data scientists assembled by Unearthed.
“I know that each of you in your own organisations have very talented people with serious responsibilities around data security, or IP and the like, and they’re going to have lots of questions. Hopefully these five reasons will arm you with enough important considerations to start a conversation in the business about why you should consider sharing your data externally.”
First, Strharsky said, it was vital that companies didn’t undervalue data assets. Unearthed had helped companies that were prepared to share data to better understand and leverage the value in sometimes concealed or siloed data.
Secondly, sharing internal data helped build “data security muscles”.
“It might seem completely counterintuitive but I’m here to tell you that I think sharing your data more is actually the best way to improve your data security practices,” Strharsky said.
“How else do you stress-test the very control systems you put in place for protecting your data? You have to go through the motions of sharing it with others to even build that muscle in your organisation [and] the muscles for doing things like exporting data, thinking about which data is sensitive and which data is not, figuring out how to do things like obfuscating or removing identification from data; things you can get better at with practice.
“I would hazard a guess that almost all of our organisations are [already] sharing data in some form and that we have some established controls, whether contractual or otherwise, for sharing that data. And that’s just going to increase. I don’t see a world in which we stop sharing data at all.
“The third reason you should share your data externally more is to attract talent.
“Newcrest was interested in predicting the density of tailings underflow in advance, to reduce the use of water in their operations. We got hundreds of our community involved in building the best ML models for doing just that. The outcome was many different models that were successful at doing that and an ability to predict tailings underflow more than three hours in advance. That’s enabled Newcrest to make a significant contribution to the bottom line of their organisation and to save many gigalitres of water in the process.
“But critically in doing so they were able to assess the skills of the people who participated in this competition in very real ways. Those people were applying ML skills to a real industrial problem. It’s actually quite a challenge if you are building data science skills on the internet to get useful data.
“For Newcrest it meant seeing how hundreds of people performed at building models for predicting tailings underflow density. And after this engagement they hired two of the top teams to continue development of some of those solutions in practice.”
A fourth key reason for opening internal data to external parties to try to resolve operating and other challenges was to “stand out from the crowd”.
Strharsky said OZ Minerals’ 2018-19 exploration challenge, to try to introduce some “new” thinking into its South Australia copper-gold exploration, drew more than 1000 people from around the world to work on more than two terabytes of proprietary exploration data. “OZ Minerals drew a line in the sand and said, we’re going to be a modern mining company and that means using our data differently to do exploration. They thought practically about that: they owned the tenement for which the data mattered so they understood they had the only rights to do anything with the insights on the back of that data. But this did build a muscle for OZ as well as building their modern mining company brand.”
OZ is one of the few mining companies worldwide to embark on these types of programs, which might mean external views on value are not aligned with those inside the sponsoring entities.
Strharsky said: “I don’t think that the perception of value matches the actual value. This is compounded by the fact that it takes an organisation, a culture, a long time to assimilate the lessons from something like this. There are a few bits of value that are obvious immediately – others take time and require cultural change.”
Reason number five, he said, was to “leverage Joy’s law”.
For the uninitiated, Strharsky said one of his first jobs, in Silicon Valley, California, was with Sun Microsystems, which Bill Joy co-founded and which Oracle ultimately bought for US$5.6 billion in 2010. “Bill Joy … was famous for saying that it doesn’t matter the size of your company – how big or small your company is – most of the world’s smartest people are outside of it. He wasn’t talking about the competitive landscape, he was talking about the solution. For all of our organisations this is true. I don’t care how many smart people you have in your business, most of the world’s smart people are outside of it.”
Strharsky said Radek Osmulski was one of at least 265 data scientists from 42 countries who vied to build the best ML model for automating CBH Group’s grain-sample grading process, which saw more than 600 models developed. “This is how you leverage Joy’s law and resolve the paradox of unfilled data science roles [next to] this incredible abundance of people with ML skills,” he said.
“We have seen in mining and several adjacent industries how sharing data has enabled companies to attract top talent, build new data security muscles, and leverage Joy’s law to build a resilient organisation that taps into the potential of this creator economy.”
CITIC’s Johnston said the company got involved in the GMG project because the potential benefits of “novel ideas” and even out-of-the-box thinking about common industry practices and problems were evident.
“We’re looking at a lot of machine health data at the moment,” he said. “We’ve saved money by having engineers looking at dashboards; if we can automate that we’re going to save more money. We can quite literally save millions.”