Month: August 2020

Weekly Digest, August 31

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.  



Dynamism of IoT-powered Gas Monitoring Solution

IoT is driving a paradigm shift in multiple industries, including the oil and gas sector. With real-time monitoring and advanced analytics, it allows O&G companies to monitor their assets effectively, from pipeline networks to machines and pumping equipment.

Currently, technology investment in the gas sector centers on mobility, asset management, cloud adoption, and analytics. It is also aimed at reducing infrastructure costs, which the use of IoT helps justify. The Internet of Things itself is focused on data efficiencies and more informed decision-making.

Along with improving operational and asset efficiency, IoT has the potential to increase output in some cases. For instance, Intel has claimed that IoT-powered infrastructure and data analytics used in gas wells can improve results by 30%. McKinsey has also estimated that offshore platforms are performing at 77% of maximum production levels.

Hence, if applied correctly, IoT and advanced analytics tools allow gas companies to collect massive amounts of data and gain valuable insights for the future. Moreover, they can generate an ROI of up to 30 to 50 times the actual investment within a very short deployment period.

The Need for An IoT-powered Gas Monitoring Solution

A large variety of applications and processes in the gas industry use highly dangerous flammable and toxic gases. The inevitable occasional escape of such gases creates a hazardous environment for workers and even nearby residents. This can result in devastating incidents involving asphyxiation, explosions, and loss of life.

In most industries, one safety measure against such situations is a gas monitoring solution that acts as an early warning system, making it easier and safer to detect toxicity in the atmosphere in advance.

It Serves Multiple Areas

Apart from huge oil and gas refineries and plants, a gas monitoring solution can also be used in commercial areas like parking lots, laboratories, sewage treatment, hospitals, and swimming pools. Such areas are high-risk because of their machinery and plant operations. Even a slight gas leak could cause major havoc and cost human lives if the necessary action is not taken. Let’s find out how this system works.


  • Real-time Monitoring
  • End-to-end Solution
  • Control Visualization
  • Immediate Alerts

These are the key factors on which a gas monitoring solution relies. Sensors installed on the assets detect gas concentrations in the atmosphere at all times and from any location. The solution is fully tailored and comes with all the necessary hardware and software, which simplifies purchasing. It is also equipped with advanced analytics to optimize and keep control of industrial operations. Along with full control, it enables remote monitoring to help identify toxic gases in remotely located infrastructure. Once it detects gas, it immediately alerts the authorities so that they can prepare the necessary measures.

A gas monitoring solution comprises the latest sensor devices, which automatically fetch data from the assets and allow the authorities to anticipate problems and improve facility maintenance. A centralized dashboard serves as a versatile management desk where managers can not only supervise environmental conditions but also perform administrative and managerial tasks for the plant facility.
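The detect-then-alert flow described above can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the `Reading` type, the sensor names, and the threshold values are hypothetical and are not taken from any particular product or regulation.

```python
from dataclasses import dataclass

# Hypothetical alarm thresholds in parts per million (ppm).
# Real limits depend on the gas, the site, and local regulations.
ALARM_THRESHOLDS_PPM = {"CO": 50.0, "H2S": 10.0, "CH4": 5000.0}


@dataclass
class Reading:
    sensor_id: str  # hypothetical identifier of the sensor on an asset
    gas: str        # gas name, used as a key into ALARM_THRESHOLDS_PPM
    ppm: float      # measured concentration


def check_readings(readings):
    """Return one alert message for each reading at or above its threshold."""
    alerts = []
    for r in readings:
        limit = ALARM_THRESHOLDS_PPM.get(r.gas)
        # Gases without a configured threshold are ignored in this sketch.
        if limit is not None and r.ppm >= limit:
            alerts.append(
                f"ALERT {r.sensor_id}: {r.gas} at {r.ppm} ppm (limit {limit} ppm)"
            )
    return alerts


if __name__ == "__main__":
    sample = [Reading("pump-01", "CO", 12.0), Reading("pipe-07", "H2S", 14.5)]
    for message in check_readings(sample):
        print(message)
```

In a real deployment the alert would be pushed to a dashboard or notification service rather than printed, and thresholds would come from configuration instead of a hard-coded table.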

Here are some of the benefits you can gain by using a gas monitoring solution on your premises.

Benefits of Implementing IoT in the Gas Industry

Remote Monitoring

How convenient would it be to monitor an industrial facility from anywhere and at any time? This is what IoT provides. It allows effective remote monitoring of industrial equipment and assets through sensor devices. Remotely located oil and gas infrastructure produces toxic gases in large quantities. Managers use remote monitoring to detect these gases and fetch real-time data so that they can act immediately at any hint of a mishap. Because it is risky for plant authorities to go and inspect the infrastructure in person, they use remote monitoring through IoT as an alternative that reduces the chances of gas explosions. This, in turn, helps protect the large sums spent on constructing the framework for oil and gas production.

Predictive/Preventive Maintenance

Many O&G companies require real-time monitoring of their machinery to keep a regular check on its condition and performance. Detecting harmful and toxic gases within the industrial facility is also a major concern in avoiding disasters. An IoT-powered gas monitoring solution that offers remote services to the authorities makes both possible and lets facilities react early through predictive maintenance. The solution uses sensors installed on the industrial equipment to identify the presence of harmful gases. When they do, a quick alert is generated that helps workers and other authorities evacuate the premises in case of a disaster. The sensors also send data to a cloud-based platform, where the authorities can predict future scenarios and avoid chaotic situations. Using predictive maintenance, managers can analyze different scenarios and decide which outcomes to act on.
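As a rough illustration of how historical sensor data might be used to look ahead, the sketch below fits a straight line to recent readings and estimates how long it would take a rising concentration to reach a threshold. This is a deliberately simple linear extrapolation chosen for illustration, not the method of any specific product; the function name, sample data, and threshold are hypothetical, and real predictive maintenance systems typically rely on richer models.

```python
def estimate_minutes_to_threshold(samples, threshold_ppm):
    """Estimate minutes after the last sample until a threshold is reached.

    samples: list of (minute, ppm) pairs, oldest first.
    Fits a straight line by ordinary least squares and extrapolates it.
    Returns None when there are fewer than two samples, when all samples
    share the same timestamp, or when the fitted trend is not rising.
    """
    n = len(samples)
    if n < 2:
        return None

    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None

    slope = sum((t - mean_t) * (y - mean_y) for t, y in samples) / var_t
    if slope <= 0:
        return None  # flat or falling trend: no crossing expected

    intercept = mean_y - slope * mean_t
    crossing_minute = (threshold_ppm - intercept) / slope
    last_minute = samples[-1][0]
    # If the fitted line is already at or above the threshold, report 0.
    return max(0.0, crossing_minute - last_minute)


if __name__ == "__main__":
    history = [(0, 2.0), (1, 4.0), (2, 6.0)]  # hypothetical readings
    print(estimate_minutes_to_threshold(history, threshold_ppm=10.0))
```

An estimate like this could feed a maintenance schedule or an early warning well before the hard alarm threshold is hit.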

Asset Management

An IoT-powered gas monitoring solution uses analytics and wireless connectivity to offer an improved, consolidated asset management process. It is an end-to-end solution that provides remote monitoring of industrial assets to avoid explosive disasters within the facilities. Apart from smart asset management, it also provides useful insights for keeping the equipment up to date. Moreover, it helps the plant authorities operate the assets with the help of predictive maintenance, making automated workflows more intelligent. It provides real-time visibility of the assets and their performance history for better analysis of their working conditions.

Implementing an IoT-powered solution to detect harmful gases is one of the most productive investments for your industry. IoT is giving us strong reasons to adopt technology for our safety. Its smart techniques and innovative concepts not only help you work in safer surroundings but also provide automation that reduces task complexity. Who would have thought that it would be possible to measure gas concentrations in the air accurately, even in huge refineries where the risk rate is 99.99%? IoT is making it possible and enabling smart plant management with reduced risk to life.

Deciding on the best infrastructure

Hi everyone,

I want to create a data repository, but I am having trouble deciding on the best approach.

What I need:

  • a repository to import several datasets from different institutions (around 750 GB to 1 TB in total);
  • be able to clean these datasets and expand the columns (feature creation and so on), creating new views/tables;
  • be able to apply ML on the data (Python or R);
  • apply simple data governance mechanics (giving domain names to columns, lineage, and nothing more, I believe).

What I would want, if possible:

A way for the data not to leave the institutions: it would sit in a landing-zone type of schema and I would query it on demand (with logging and access control).

NOTE: all the data is relational. There is no streaming, only a batch upload (a first one only, or maybe a few more updates along the way).

What is the best architecture for me? A simple RDBMS? Hadoop with Hive and so on? Dremio? Denodo? Delta Lake?

Thank you!

submitted by /u/joofio

Matplot++: A C++ Graphics Library for Data Visualization

Data visualization can help programmers and scientists identify trends in their data and efficiently communicate these results to their peers. Modern C++ is being used for a variety of scientific applications, and this environment can benefit considerably from graphics libraries that address the typical design goals of scientific data visualization. Besides the option of exporting results to other environments, the customary alternatives in C++ are either non-dedicated libraries that depend on existing user interfaces or bindings to other languages. Matplot++ is a graphics library for data visualization that provides interactive plotting, means for exporting plots in high-quality formats for scientific publications, a compact syntax consistent with similar libraries, dozens of plot categories with specialized algorithms, multiple coding styles, and support for generic backends.

submitted by /u/FreitasAlan

Distributed deep learning framework

Hi everyone!

I’ve been using TensorFlow for an image classification problem. For now I am working with 8,400 images, but in the near future that may grow 10x, which is why I am wondering if there is any mature framework for distributed deep learning (multiple nodes rather than multi-GPU). I think BigDL only works on Xeon CPUs, which isn’t ideal for me. Do you know if Spark 3.0 brought new features that could help me with this problem?


submitted by /u/Aliph0

Reshaping Healthcare with Data Analytics & Business Intelligence

Big data analytics in the healthcare industry is evolving into a promising field for delivering real-time insights from very large data sets. It also helps improve outcomes while reducing costs.

The ongoing trend toward digitization is unlocking the potential of data analytics and business intelligence.

According to industry experts, the AI health market is expected to reach $6.6 billion by 2021, and by 2026 AI could potentially save the U.S. healthcare economy $150 billion annually.

Today, the changing landscape of healthcare is creating a huge demand for health data analytics. Modern, cutting-edge data analytics is used to improve patient care in the healthcare system. Analyzing the available data with the best modern practices helps cut costs and improve people’s health faster.

Data Analytics: A Brief Intro

Data analytics is the process of collecting, inspecting, transforming, and analyzing data to generate real-time insights that help in making crucial decisions faster.

In Gartner’s terms, “Big Data is defined as high-volume, high-velocity and high-variety information assets that require cost-effective, innovative forms of information processing for enhanced insight and decision making”.

Advantages of Data Analytics in Healthcare

Let’s check out some of the key advantages of Big Data analytics in the healthcare industry:


The monitoring of vital signs is important to ensure a proactive approach to a person’s healthy state. For instance, diabetic patients can track their next insulin dosages, upcoming medical appointments, and so on.

Cost reduction 

Big Data makes it easier to manage crucial information and use it to drive cost improvements. With the help of real-time insights, healthcare organizations can track areas where costs can be minimized, such as diagnostic tests and admission rates.

Error minimization and precise treatments

Big Data helps healthcare organizations provide accurate and personalized treatment. With real-time insights, it is easier to see quickly how a patient responds to a particular treatment.

Preventive care

Big Data supports preventive care, helping providers reduce medical risks and care for patients more efficiently.

Streamline hospital operations

With big data analytics, insights are generated quickly, which makes it easier to manage the operational aspects of the organization. It helps streamline hospital operations and track staffing metrics.


The Role of Big Data in Healthcare

The potential of Big Data in healthcare lies in the ability to detect patterns and turn high volumes of data into actionable knowledge for making crucial decisions about a patient’s health. Big data analytics improves operational efficiency for smart healthcare providers by delivering real-time updates.

Today, real-time dashboards help businesses operate seamlessly. Analytics solutions not only focus on improving patients’ lives but also help enhance stakeholder value and boost revenues. They give healthcare organizations real insights that affect patients’ health.


Healthcare Business Intelligence

Healthcare Business Intelligence is the digital process through which bulk data from the healthcare industry is collected, refined, and analyzed into real-time insights. The four key healthcare areas where business intelligence can be used are costs, pharmaceuticals, clinical data, and patient behavior.

Healthcare organizations use business intelligence to store data in a centralized data warehouse, secure patients’ data, analyze data accurately, and share digital reports with all departments.

By integrating business intelligence, healthcare providers get actionable insights to reduce costs, boost sales, and improve patient safety in line with regulations.

According to a report from the McKinsey Global Institute, applying big data to predict U.S. healthcare needs and enhance efficiency and quality could save between $300 billion and $450 billion annually.


Benefits Of Business Intelligence in Healthcare

Business intelligence covers a broad spectrum and delivers enormous benefits to the healthcare industry. Let’s take a look.

Financial assistance

Automated database systems and intelligent data alerts facilitate maximum transparency in the finance department. Health organizations should implement analytics solutions to address operational, financial, and patient care related activities.

Evaluating performance

Healthcare business intelligence software can easily track a healthcare organization’s activities and analyze them in real time. Once the data is collected, specific actions can be taken to reduce costs.

Patient satisfaction

Integrating the right analytics software helps administrators handle critical tasks easily and keep track of patients’ updated data.

Coordinating communication

Healthcare analytics software can be used to access patients’ data and current progress and to review a patient’s medical history at any time and from any place. With real-time data, healthcare staff get updated information faster. Medical cases are handled better when crucial patient data is addressed rapidly.

Managing reputation

The healthcare industry today requires agile, fast, interactive software to maximize the value of data and support decision making. Business intelligence in healthcare lets organizations build a reputation around patient and clinical care and drive collaboration across all departments.

Predicting the future

Advanced analytics gives healthcare professionals the power to predict certain critical conditions before they arise. It helps them take proactive steps, provide the best care for patients, and deliver high-quality treatment.

According to a report, the market for business intelligence in healthcare is expected to grow at a compound annual rate of about 17.4%, from $3.75 billion in 2017 to $15.88 billion by 2026.
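The figures above are consistent with a compound annual growth rate (CAGR) of roughly 17.4% over the nine years from 2017 to 2026, rather than a one-off 17.4% increase. A minimal Python sketch of that arithmetic follows; the function name `cagr` is ours, and the inputs are simply the numbers quoted above.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # $3.75 billion in 2017 growing to $15.88 billion by 2026
    rate = cagr(3.75, 15.88, 2026 - 2017)
    print(f"{rate:.1%}")  # prints 17.4%
```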

Key Takeaways

  • These technologies help avoid excessive waiting times in healthcare organizations and improve patient-doctor relationships
  • They optimize patient care and revenue streams
  • They make real-time data available for real-life situations
  • They are revolutionizing the healthcare industry with tools and technologies that affect profits
  • They balance and recognize the needs of patients and doctors with real-time insights
  • They uncover profound insights and improve operational efficiency


Putting it all Together

The digital era is changing rapidly as technologies evolve with every passing day, and so is the healthcare industry. Data-driven business intelligence tools and analytics enhance healthcare performance, revenues, and patient experience. Adopting business intelligence in your organization can help you boost revenues and improve customer satisfaction.

Emerging technologies such as predictive analytics, Artificial Intelligence, and Machine Learning are revolutionizing healthcare standards and affecting our lives to a great extent.

It’s time for all of us to get digitized and be prepared for the miracles that are going to happen in the coming years.

What is data science, once again…

According to Wikipedia, data science is an inter-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. All or nothing at all. This sounds like the chorus of a teen-pop song. The problem is that when companies hire a data scientist, especially during the digital transformation phase, they want it all. What they really get instead is a statistician, a computer scientist, an analyst, a data engineer, an IT project manager, you name it. From my perspective, we still lack the clarity to distinguish between operative data science and research-and-development-driven data science.

An operative data scientist has the overriding aim of supporting the operations of a company. This person enables and ensures data-driven decision making on the operative level. And then there is the person who does all the fancy stuff, like machine learning and artificial intelligence: the research and development data scientist. Both may have similar training and experience but clearly different stakeholders. An operative stakeholder wants support in creating and easily accessing good-quality data for reports and decision making. Most data scientists in industrial companies spend nearly 90% of their time coping with data. So 90% of the time, you are a data engineer. Reports and visualizations are the other part. This low-hanging fruit is necessary; operations require it. Sometimes method equals business value. But more often the value is determined by the organizational needs of a company, not by the method. Dashboards and reports are relatively simple and easy from a data scientist’s perspective, but their impact is high.

A data scientist located in a research and development department has stakeholders who would like to dig deeper and support the vision of a company with strategic ideas inspired and initiated by data. This may be a fresh data-based view of the customer structure, the product portfolio, or even a production or distribution site. Assuming you have good-quality data, a data science team can apply a huge variety of more sophisticated machine learning and AI methods. Of course, these methods can be applied on the operative level as well. But experience shows that the operative data scientist just does not have enough time to take such products from development to implementation.

What is a data scientist? The answer is, it depends. But it is definitely not all of the above at once. We focus our training and experience on certain areas. Hence, dear companies, when you open a data scientist position, be precise about where you want to place that data scientist. If you are looking for operational support, this person should be hands-on in data engineering techniques, analysis methods, and visualization. In research and development, you would require somebody who is strongly advanced in machine learning and AI.

Will Cloudera be more or less relevant to your org a few years from now?

My org thinks we’re in about the right place to adopt Cloudera. We understand there’s a bit of bad juju around it because of unnecessary adoption a few years ago, but it seems like they have a solid multi-cloud offering.

That said, we try to be forward looking and only adopt tools that will maintain their relevance a few years down the line.


submitted by /u/them_russians
