BIG DATA - HPC
CHRISTIAN SAGUEZ (TERATEC)
More and more data is becoming available in all sectors of the economy. Understanding this massive amount of data is strategic for research, industry and public administration. In this context, it is essential to develop highly efficient parallel systems, both hardware and software, to tackle this data, a field known as "High-Performance Data Analytics" (HPDA). This requires new technologies (ranging from storage to system architectures to new programming models) as well as new, efficient algorithms and applications optimized for HPC systems.
This session will present the challenges faced by scientific and industrial HPDA users and discuss the technological and architectural aspects.
Feedback from HPDA in Oil and Gas
Various examples of HPDA usage in oil and gas illustrate the algorithmic and software effort still to be made (especially in Europe). The impact of these new types of activities on intellectual property will also be outlined using these examples. Likewise, the effort required in education to realize the potential promised by HPDA will be illustrated with oil and gas examples.
High Performance Data Analytics: HPC meets Big Data, what technological impacts?
This talk will focus on the diverse technological evolutions required by the HPDA market: the problems to solve, storage, servers, architectures and security aspects, illustrated by a few examples of initiatives and projects already paving the way towards HPDA. (presentation)
Numerical technologies for tomorrow’s agriculture
The rise of numerical technologies has enabled the development of decision support tools dedicated to players in agriculture. Such decision support tools can take various forms: optimizing cultivation practices, assisting plant breeding or forecasting crop production on regional scales.
In parallel to the development of decision support tools, large databases have been built by many players in agriculture: academic institutions, co-ops, seed companies, and the primary processing industry.
These databases play an essential part in the development of next-generation decision support tools based on deep learning techniques. For these new types of decision support tools, the need for HPC resources becomes significant to sustain the analysis of the ongoing data flow. (presentation)
SAGE: On Building a Storage System for HPC + Big Data Use Cases
The SAGE project is building a novel storage platform that holistically addresses the combined requirements coming out of both the extreme computing and scientific big data worlds. Indeed, SAGE provides one of the "BDEC", or "Big Data - Extreme Computing", storage systems. The talk covers some of the unique aspects of the SAGE architecture, which was co-designed with these new types of use cases, encompassing synchrotron light sources, nuclear fusion, climate and weather, bioinformatics, etc. (presentation)
The data deluge in high-resolution climate and weather simulation
Climate and weather simulations produce an increasing amount of data as models become more complex, are run at higher resolution, and are run in "ensembles", where many variants are executed at the same time to establish measures of uncertainty. Most large weather and climate centres will be managing exabytes of data before they have access to exaflop computers. This data deluge raises a number of issues, from parallel I/O to post-processing, data analysis and efficient storage, not to mention the need to acquire and manage high-resolution observations to validate these models. International climate modelling intercomparison projects share large amounts of data worldwide. Cloud computing and big-data technologies are playing an increasing role in handling and accessing these data, but they introduce new problems and challenges for both data producers and user communities. Activities within the European Network for Earth System modelling (ENES) and the Center of Excellence for Simulation in Weather and Climate (ESIWACE) are addressing some of these issues. (presentation)
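The scale of this deluge can be sketched with a back-of-envelope estimate. The sketch below multiplies grid size, variable count, output frequency and ensemble size; all the specific numbers (a ~10 km global grid, 100 levels, 50 variables, 6-hourly output, 50 members) are illustrative assumptions, not figures from the talk:

```python
# Back-of-envelope estimate of raw output volume for a high-resolution
# ensemble climate simulation. All parameter values are illustrative
# assumptions, not figures from any specific modelling centre.

def output_volume_bytes(nx, ny, nz, variables, timesteps,
                        ensemble_members, bytes_per_value=4):
    """Total raw (uncompressed) output size in bytes for one campaign."""
    values_per_snapshot = nx * ny * nz * variables
    return values_per_snapshot * timesteps * ensemble_members * bytes_per_value

# Hypothetical setup: ~10 km global grid (3600 x 1801 points),
# 100 vertical levels, 50 output variables, 6-hourly snapshots
# over 100 simulated years, 50 ensemble members, 4-byte floats.
total = output_volume_bytes(
    nx=3600, ny=1801, nz=100,
    variables=50,
    timesteps=100 * 365 * 4,   # 6-hourly output for 100 years
    ensemble_members=50,
)

print(f"{total / 1e18:.1f} EB")  # prints 0.9 EB: already exabyte scale
```

Even with these modest assumptions and before any reruns or derived products, a single campaign approaches an exabyte, which is why storage, compression and post-processing strategy become first-order concerns.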