Month: April 2021

How data engineering might change over the next 5 years

We interviewed different people working in data engineering to talk about the future of the data analytics space. What was particularly interesting in this exercise was how differently those interviewed thought about the future of the space. We’ve heard everything from streaming to cataloguing to monitoring as future areas that teams believe will become front and centre over the next five years. Below are the top three takeaways we had from the interviews presented in the report.

Specialization will grow within the data team

Most data engineers and data analysts wear many hats today, because investment in data teams has only recently increased. As the value of data teams becomes more evident and more investment flows to this department, data teams will specialize, with each role focused on a particular function. This could mean having a reliability data engineer, a visualization lead and a separation between backend and frontend data engineering teams. We believe these kinds of organizational changes will begin to take shape over the next 5 years.

The “data gap” between data producers and consumers will shrink

As more investment is directed towards self-service analytics, the gap between data consumers and data producers will continue to shrink. Tools that help teams centralize an understanding of data will become mandatory across all data teams. We’ve solved storing data, and moving data, as well as visualizing data. When we look at the challenges that a team faces today, the idea of self-serve analytics and understanding is the next largest issue.

Data will become a product

More data teams will adopt practices that help them measure, manage and develop data like a product team. On the surface, this might mean a transition towards agile project management. At a more intricate level, this might mean transitioning towards data tools that enable cross-organization collaboration, version control and monitoring. We believe that innovation in this area of data analytics will be interesting.

If you’re interested in the future of data analytics and want to see the full transcripts, you can read the entire report here. If you’re interested in the article with key takeaways, you can check it out here:

submitted by /u/secodaHQ

Decoupled headless BI stack with PostgreSQL – we want your feedback on the product!

This article demonstrates a decoupled headless BI stack that can be deployed to a Kubernetes cluster or to a Docker container on your local machine. We released this free community cloud-native version of our platform a few weeks ago:

Your feedback on the product will be valuable. The docker image comes with Postgres containing demo data, but you can use your own data source.



submitted by /u/xMort

Breached Online Ordering Platforms Expose Dozens of Restaurants

Editor’s Note: The following post is an excerpt of a full report by Gemini Advisory. To read the entire analysis, click here to view the full report. 


The dark web marketplace for stolen payment cards has been transitioning from the Card Present (CP) space to the Card Not Present (CNP) space for years, meaning that cybercriminals increasingly target online transactions instead of in-person purchases. This has accelerated during the lockdown measures related to COVID-19 since the volume of CNP transactions spiked at the expense of CP transactions. Restaurants are the latest example of this trend. They were formerly largely targets of CP fraud, but hackers have now set their sights on online ordering platforms.

In the past 6 months, Gemini has reported on breaches of 5 restaurant services companies that resulted in compromised payment cards being offered for sale on the dark web. These breaches exposed approximately 343,000 compromised payment cards and affected 70 restaurants out of the more than 900 restaurants downstream from the platforms. The cards’ median prices ranged from $5 to $10 depending on the breached restaurant service company, and the breaches primarily affected US-based banks.

All 5 companies offer online ordering through centralized platforms. These services operate according to two different models. The first model is a third-party service that the restaurant uses as its own infrastructure for placing orders. These platforms are offered alongside physical restaurant point-of-sale (POS) solutions. The second model is a third-party service that operates as an additional option to complement the restaurant’s infrastructure, like regional versions of popular services such as Grubhub and DoorDash.

Gemini observed 3 restaurant services companies operating under the first model. They operate alongside POS systems and basically centralize the function of online ordering, but decentralize the transaction processing. The order and payment card information is entered via the restaurant’s portal, which is hosted on the service provider’s domain. Once completed, the order and payment card details are forwarded to the restaurant for acceptance and processing via the restaurant’s POS system, which results in a CNP transaction using the restaurant’s merchant information.

The other two online ordering service providers take the online order, collect the payment information, and process the transaction on their own system utilizing their own Merchant Name and Merchant ID (MID). This is the more common model seen in use by restaurant ordering and delivery service providers such as DoorDash and Grubhub.

Key Findings

  • In the past 6 months Gemini has reported on breaches of 5 restaurant services companies that offer online ordering through centralized platforms. This has led to the exposure of approximately 343,000 payment cards.
  • These services include MenuSifu, E-Dining Express, Easy Ordering, Food Dudes Delivery, and Grabull, which operate on two distinct models but all allow cybercriminals access to payment card transactions from customers ordering from participating restaurants in the event of a breach. Gemini found exposure related to orders at 70 restaurants, although over 900 restaurants were downstream from the platforms.
  • As can be seen across all five restaurant service providers, there is a tendency for geographic concentration in the vicinity of the service providers’ headquarters. The breaches affected most regions of the United States, although the highest concentrations were in New England and the Midwest.
  • Attacks such as these are appealing because breaching a single website can compromise transactions at dozens of restaurants. Due to the lucrative nature of successful breaches of online ordering platforms, cybercriminals are likely to continue attacking these merchants.

Editor’s Note: This post was an excerpt of a full report by Gemini Advisory. To read the entire analysis, click here to view the full report.

The post Breached Online Ordering Platforms Expose Dozens of Restaurants appeared first on Recorded Future.

The Business of Fraud: Deepfakes, Fraud’s Next Frontier

Insikt Group

Editor’s Note: The following post is an excerpt of a full report. To read the entire analysis, click here to download the report as a PDF.

Recorded Future analyzed data from the Recorded Future® Platform, dark web, information security reporting, and other open-source intelligence (OSINT) sources to identify how, and how widely, threat actors are attempting to advertise, discuss, sell, and purchase deepfake-related services and products that facilitate fraudulent activities. In this report, we define deepfakes as synthetically generated visual and audio content that is being used offensively to target individuals, companies, and security systems. This report is part of our series on the business of fraud.

Executive Summary

Threat actors have begun to use dark web sources to offer customized services and tutorials that incorporate visual and audio deepfake technologies designed to bypass and defeat security measures. Furthermore, threat actors are using these sources, as well as many clearnet sources such as forums and messengers, to share tools, best practices, and advancements in deepfake techniques and technologies. As reported by Insikt Group’s Criminal and Underground Team throughout 2020, threat actors are developing customized deepfake products.

We believe they will continue to develop these products, as the demand is likely to increase due to corporations incorporating visual and audio recognition technologies into their security measures. Within the next few years, both criminal and nation-state threat actors involved in disinformation and influence operations will likely gravitate towards deepfakes, as online media consumption shifts further towards “seeing is believing” and as these actors bet that a proportion of the online community will continue to be susceptible to false or misleading information.

Key Judgments

  • Malicious use of deepfake technology has moved beyond the creation of pornography-related content to more sophisticated targeting that includes bypassing security measures and releasing misinformation and disinformation. Publicly available examples of criminals successfully using visual and audio deepfakes highlight the potential for all types of fraud or crime, including blackmail, identity theft, and social engineering.
  • English- and Russian-language dark web forums were identified as the main sources for users to advertise, discuss, share, and purchase deepfake-related products, services, and topics. The most widely used forums were found to be low- to mid-tier forums that have lower barriers to entry, but activities were also found on high-tier forums. Deepfake topics were also identified on Turkish-, Spanish-, and Chinese-language forums.
  • The most common deepfake-related topics on dark web forums included services (editing videos and pictures), how-to methods and lessons, requests for best practices, sharing free software downloads and photo generators, general interests in deepfakes, and announcements on advancements in deepfake technologies.
  • There is a strong clearnet presence and interest in deepfake technology, consisting of open-source deepfake tools, dedicated forums, and discussions on popular messenger applications such as Telegram and Discord.
  • Discussion of deepfakes on most publicly available forums and messengers centers on education about, and genuine interest in, deepfake technology, with users also sharing content and refining their craft, in line with discussions identified on closed dark web sources. In the future, we believe that this otherwise relatively benign community can serve as a basis for individuals to venture into illicit criminal activity using learned deepfake skills.

Editor’s Note: This post was an excerpt of a full report. To read the entire analysis, click here to download the report as a PDF.

The post The Business of Fraud: Deepfakes, Fraud’s Next Frontier appeared first on Recorded Future.


The complainant requested information relating to a planning application. Somerset County Council (the Council) provided some information within the scope of the request but denied holding further information. The complainant considered that the Council held further information within the scope of his request. The Commissioner’s decision is that, on the balance of probabilities, the Council is correct when it says that it holds no further information within the scope of the request. The Commissioner requires no steps to be taken as a result of this decision.


The complainant requested an advice note relating to a planning matter. Armagh City, Banbridge and Craigavon Borough Council (the ‘Council’) initially refused to provide the advice note, citing Regulation 12(5)(b) (the course of justice and inquiries) of the EIR. Following its internal review, the Council revised its position and instead relied upon Regulation 12(5)(f) (interests of the information provider) and said that the public interest favoured withholding the requested advice note. The Commissioner’s decision is that the Council was correct to handle this request under the EIR. She finds that Regulation 12(5)(f) of the EIR is engaged and that the balance of the public interest favours maintaining the exception. By failing to carry out its internal review within the statutory 40 working days’ limit, the Council has breached Regulations 11(4) and 11(5) of the EIR. No steps are required as a result of this notice.

Can you learn Big Data without having to study a specific degree?

Let me explain. I am finishing a degree and I am deeply interested in big data, machine learning and so on. I do not want to start another degree, so I would like to know whether it is possible to reach the same level of knowledge by studying on my own.

Also, is there any good book to start studying it?

submitted by /u/caraclaklas

Airflow orchestration console batch apps

Is it possible to orchestrate .NET console apps that depend on other console apps? Are there any operators to invoke .NET console apps?

submitted by /u/GK-Cosmos
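
One possible direction, offered as a sketch and not a definitive answer: Airflow's BashOperator runs an arbitrary shell command, so a console app can be invoked with the `dotnet` CLI, one task per app, with dependencies declared between tasks. The block below is not Airflow code. It only illustrates the underlying idea (run each app after the apps it depends on) using the Python standard library, so it runs without Airflow installed. The app names, the `bin/` path layout, and the helper names `build_command` and `run_in_order` are all hypothetical.

```python
import subprocess
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical console apps: each key depends on the apps in its set.
DEPENDENCIES = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}


def build_command(app: str) -> list:
    """Build the command line for one console app.

    The bin/<app>.dll layout is an assumption made for illustration.
    """
    return ["dotnet", f"bin/{app}.dll"]


def run_in_order(dependencies: dict, runner=subprocess.run) -> list:
    """Run every app only after the apps it depends on have finished.

    With the default runner, check=True raises CalledProcessError on a
    non-zero exit code, which stops the remaining apps from running.
    """
    executed = []
    for app in TopologicalSorter(dependencies).static_order():
        runner(build_command(app), check=True)
        executed.append(app)
    return executed


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    run_in_order(DEPENDENCIES, runner=lambda cmd, check: print(" ".join(cmd)))
```

In an Airflow DAG, each command would instead become the `bash_command` of a BashOperator, and the ordering would be expressed with the `>>` operator between tasks, leaving scheduling and retries to Airflow.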

Stay Ahead of Global Uncertainty With Real-time Geopolitical Intelligence for Esri

For a complete understanding of the threats to their organizations, all intelligence analysts must consider how geopolitical events, such as a global pandemic, terrorist attack, or natural disaster, will impact their organization, supply chain, and industry. They need to respond swiftly to these threats, to mitigate disruptions to operations and protect their assets, but organizations are still susceptible to being blindsided at the most inopportune times because intelligence often lags.

Relying on disparate data sources and manual processes means insights are often incomplete or outdated. Most analysts spend too much time manually collecting, analyzing, and visualizing a vast amount of intelligence, not to mention translating information from news sources in these regions’ local languages. To monitor and respond to geopolitical threats in real time, teams need a more efficient and collaborative way to report on relevant insights that drive more informed decision-making.

A Comprehensive View of Your Physical Threat Landscape

Geopolitical Intelligence from Recorded Future accelerates critical decision making with contextual open data on geopolitical threats and trends, empowering you to protect your assets and understand shifting dynamics in the geographic areas that matter to your organization. Recorded Future eliminates manual research and surfaces intelligence in real time, providing a comprehensive view of your physical threat landscape anywhere in the world.

Recorded Future automates real-time monitoring, collection, and analysis of data from the broadest range of sources, including social media, open source, dark web, and more. By dynamically linking, categorizing, and updating this information in real time, Recorded Future delivers intelligence that is consumed easily by analysts for rapid detection and analysis of risks to physical assets. 

Armed with this location-based intelligence in every language, with visibility into historical risk levels and supporting evidence, analysts can rapidly analyze events and confidently take action based on these insights to protect their organization’s assets. Location-based watch lists, real-time risk scores, and centralized search capabilities surface relevant intelligence for fast threat detection and robust reporting. And today, Recorded Future announced major enhancements with the inclusion of Esri ArcGIS integration to its Geopolitical Intelligence module, simplifying and accelerating risk monitoring and response workflows with even more high-confidence geopolitical insights at your fingertips.

Real-Time Location and Global Event Monitoring with Esri

Relevant insights, updated in real time, and visualized on Esri dashboards drive faster, more informed decisions. The Recorded Future integration for Esri positions real-time intelligence directly within industry-leading mapping and geospatial analytics software so analysts can take immediate action to secure their people, facilities, and products in an ever changing world. 

By automatically layering Recorded Future’s geopolitical event and location intelligence in Esri ArcGIS, analysts are empowered to rapidly visualize, monitor, and respond to risks to their organization’s operations and assets. Esri analysts can then use Recorded Future intelligence in tandem with many other data sources within their Esri instance to gain a comprehensive view of their physical threat landscape. For deeper analysis of threats, Esri analysts can easily pivot to the Recorded Future Portal, where they can conduct tailored searches with the Advanced Query Builder and filter Recorded Future’s geopolitical intelligence by keywords, event types, sources, time frames, and more.

To learn how you can accelerate location risk analysis with Esri, request a demo of Recorded Future’s Geopolitical Intelligence module today!

The post Stay Ahead of Global Uncertainty With Real-time Geopolitical Intelligence for Esri appeared first on Recorded Future.

How might a csv file be ingested in a data lake via pipelines?

What would the general flow be for adding a CSV file to a data lake deployed, for instance, on S3? How would it be stored, extracted, and loaded? I’m brainstorming the architecture for a data pipeline system driven off a data lake.

submitted by /u/KimJongUhn
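
One common pattern, sketched here under stated assumptions and not as a definitive design: land the CSV unchanged in a "raw" zone under a date partition, then read it back, apply light cleaning, and write the result to a "curated" zone. In the sketch a local directory stands in for the S3 bucket, so the paths correspond to what would be object key prefixes on S3. The zone names, the `ingest_date=` partition layout, the cleaning rule, and the function names `land_raw` and `load_curated` are all illustrative assumptions.

```python
import csv
import shutil
import tempfile
from datetime import date
from pathlib import Path


def land_raw(csv_path: Path, lake_root: Path, dataset: str, day: date) -> Path:
    """Store: copy the CSV unchanged into the raw zone under a date partition."""
    target_dir = lake_root / "raw" / dataset / f"ingest_date={day.isoformat()}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / csv_path.name
    shutil.copyfile(csv_path, target)
    return target


def load_curated(raw_file: Path, lake_root: Path, dataset: str, day: date) -> Path:
    """Extract and load: read the raw file, keep only complete rows,
    and write them to the curated zone under the same partition."""
    target_dir = lake_root / "curated" / dataset / f"ingest_date={day.isoformat()}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / raw_file.name
    with raw_file.open(newline="") as src, target.open("w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        header = next(reader, None)
        if header is None:  # empty input file: nothing to load
            return target
        writer.writerow(header)
        for row in reader:
            # Illustrative cleaning rule: drop rows with missing cells.
            if len(row) == len(header) and all(cell.strip() for cell in row):
                writer.writerow(row)
    return target


def demo() -> list:
    """Run the flow end to end on a tiny in-memory example."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "orders.csv"
        source.write_text("id,amount\n1,10.5\n2,\n3,7.0\n")
        day = date(2021, 4, 1)
        raw = land_raw(source, root / "lake", "orders", day)
        curated = load_curated(raw, root / "lake", "orders", day)
        with curated.open(newline="") as fh:
            return list(csv.reader(fh))


if __name__ == "__main__":
    print(demo())
```

Keeping the raw copy untouched means the curated data can be rebuilt if the cleaning logic changes. Against a real bucket, the copy step would be replaced by an upload through an S3 client such as boto3, and the curated zone is often written in a columnar format such as Parquet instead of CSV.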
