Interview with Patrick Valduriez

Top database researcher and new scientific advisor at LeanXcale, Patrick Valduriez, talks to us about the latest trends in the database world, his new position at LeanXcale, and the upcoming edition of the best-selling book he co-authors with Professor Tamer Özsu, from the University of Waterloo in Canada.

Hi Patrick, and thank you very much for being here with us. It’s a pleasure to talk with you about the database world. But first, we would like to know more about your decision to join LeanXcale.

Q. You have recently joined LeanXcale as Scientific Advisor. Why did you take that step?

Patrick Valduriez: It is a great opportunity for me, and probably the right time, to go deeper in applying the principles of distributed and parallel databases to real-world problems. LeanXcale has a disruptive technology that can make a big difference in the DBMS market. I am pleased to be part of this exciting adventure and to have the chance to work with a great team of researchers and engineers in Europe.

 

Q. How did you first come into contact with LeanXcale?

Patrick Valduriez: I first met Professor Ricardo Jimenez-Peris (LeanXcale’s CEO and founder) in 2005 at a VLDB workshop on database replication, where we both gave talks. After some discussion, it became obvious that we could learn much from each other. Ricardo is a leading expert in transaction management and database replication, which nicely complements my expertise in distributed and parallel query processing. Thus, we started doing joint research on distributed and parallel data management and have been good friends ever since. In 2013, Ricardo invited me to participate in the CoherentPaaS European Project, in which we developed the CloudMdsQL polystore. During the project, LeanXcale was created. Since then, our collaboration has continued, producing excellent research results.

 

Q. Could you give us an overview of today's database market?

Patrick Valduriez: For the last 30 years, the database market has been dominated by relational DBMSs, which have proved effective in mission-critical application domains (e.g., transaction processing and business intelligence). In particular, the SQL language has fostered their wide adoption by both tool vendors and application developers. However, with the advent of big data, RDBMSs have been criticized for their “one size fits all” approach. As an alternative solution, more specialized NoSQL DBMSs, such as key-value stores, document stores and graph DBMSs, have emerged, able to scale out in large clusters of commodity servers. However, scalability has typically been achieved by relaxing database consistency. NewSQL is a recent class of DBMS that seeks to combine the scalability of NoSQL systems with the strong consistency and usability of RDBMSs. An important class of NewSQL is Hybrid Transactional/Analytical Processing (HTAP), whose objective is to perform real-time analysis on operational data, thus avoiding the traditional separation between operational database and data warehouse and the complexity of dealing with ETL. LeanXcale is at the forefront of the HTAP movement, with a disruptive technology that provides ultra-scalable transactions, polyglot queries, key-value capabilities, and much more.

 

Q. You have been involved in the startup world before. How was that experience?

Patrick Valduriez: In the 1990s, I managed Dyade, a joint venture between Bull and Inria, to foster the development of core technologies in information systems. Dyade was a great success, with some major technology transfers into Bull products and four startups that are still in business (TrustedLogic, Kelkoo, Jalios and Scalagent). I was directly involved in the transfer of the Disco technology (an Internet data integration system), which I developed with my Inria team, to Kelkoo, a successful price comparison service. I learnt a lot from this experience, in particular that, in addition to excellent research results, strong knowledge of the business domain and a good vision of the future are critical.

 

Q. With Professor Tamer Özsu (University of Waterloo, Canada), you are co-author of “Principles of Distributed Database Systems”, the bestselling textbook on the topic. From the first edition published in 1991 to the upcoming fourth edition, how has the world of distributed database systems evolved?

 

Patrick Valduriez: Distributed database systems have moved from being a small part of the worldwide computing environment a few decades ago to the mainstream today. The editions of the book reflect this impressive evolution. The first edition describes relational distributed database systems, involving just a few geo-distributed sites. The second edition introduces single-site distributed database systems, also called parallel DBMSs, and object-oriented distributed database systems. The third edition reflects an accelerated investigation of distributed data management technologies over the preceding period in the context of P2P, cluster, XML, data streaming, Web data integration systems and cloud computing. As a result, the book has become quite big (850 pages).

 

Q. What are the main updates in the new edition of the book?

Patrick Valduriez: First, to make room, we removed some background material, which is now well presented elsewhere, and reorganized and updated previous chapters. Second, we added new material on recent hot topics such as big data, NoSQL, NewSQL, polystores, web data integration and blockchain. As a short preview, note that there is a section on LeanXcale’s ultra-scalable transaction management approach in the transaction chapter and another section on LeanXcale’s architecture in the NoSQL/NewSQL chapter. My co-author and I thought these deserved to be in the book.

 

Q. As Scientific Advisor of LeanXcale, what is your role?

Patrick Valduriez: I see my role as a sort of consulting chief architect for the company, providing advice on architectural and design choices as well as implementation techniques. I will also do what I like most, i.e., teach the engineers the principles of distributed database systems, keep watch on new technology, write white papers and blog posts on HTAP-related topics, and give presentations at various venues.

 

Q. What are you currently working on at LeanXcale?

Patrick Valduriez: The first topic is query optimization, based on the Calcite open-source software, where we need to improve the optimizer's cost model and search space, in particular to support bushy trees in parallel query execution plans. The second topic is adding a JSON data type to SQL, inspired by the now famous SQL++ language, in order to combine the best of relational DBMSs and document NoSQL DBMSs.
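
For readers less familiar with the terminology: a left-deep plan joins one relation at a time into a single growing chain, whereas a bushy plan may also join the results of two independent sub-joins, which can then be evaluated in parallel. The following is a minimal Python sketch of why allowing bushy trees enlarges the optimizer's search space. It is an illustration only, not Calcite or LeanXcale code, and all function names are hypothetical. It counts plan shapes with standard combinatorics (n! orderings for left-deep trees, n! times the Catalan number C(n-1) for ordered binary trees) and ignores costs and any pruning a real optimizer would apply.

```python
from math import comb, factorial


def left_deep_plan_count(n: int) -> int:
    """Left-deep join trees over n relations: one per ordering of the relations."""
    return factorial(n)


def bushy_plan_count(n: int) -> int:
    """All ordered binary join trees over n relations (left-deep ones included).

    Catalan(n-1) tree shapes, each with n! ways to assign relations to leaves.
    """
    catalan = comb(2 * (n - 1), n - 1) // n
    return factorial(n) * catalan


def left_deep_height(n: int) -> int:
    """A left-deep plan is a chain, so its n-1 joins run one after another."""
    return max(n - 1, 0)


def min_bushy_height(n: int) -> int:
    """A balanced bushy plan needs only ceil(log2(n)) levels of joins."""
    return (n - 1).bit_length() if n > 0 else 0


# Print how the two search spaces and plan heights grow with the number of relations.
for n in (2, 4, 8):
    print(
        f"n={n}: left-deep plans={left_deep_plan_count(n)}, "
        f"bushy plans={bushy_plan_count(n)}, "
        f"left-deep height={left_deep_height(n)}, "
        f"best bushy height={min_bushy_height(n)}"
    )
```

For four relations this gives 24 left-deep plans against 120 bushy ones. The larger space is harder to search, but the shallower trees it contains leave more room for joins to run in parallel.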

 

Q. Is there anything else you would like to mention?

Patrick Valduriez: Well, the adventure has just started and it is already a lot of fun. I like to learn from real problems, and LeanXcale has great use cases to satisfy my curiosity and creativity. I want to thank the company for its trust in me.


 

Patrick Valduriez

Patrick Valduriez is a senior scientist at Inria in France. He was a scientist at Microelectronics and Computer Technology Corp. in Austin (Texas) in the 1980s and a professor at University Pierre et Marie Curie (UPMC) in Paris in the early 2000s. He has also consulted for major companies in the USA (HP Labs, Lucent Bell Labs, NERA, LECG, Microsoft), Europe (ESA, Eurocontrol, Ask, Shell) and France (Bull, Capgemini, Matra, Murex, Orsys, Schlumberger, Sodifrance, Teamlog). His career has been recognized with prestigious awards and prizes, such as the 1993 IBM scientific prize in France, the VLDB 2000 best paper award and the 2014 Innovation Award from Inria – French Academy of Science – Dassault Systems. Now, in a part-time consulting role, he is embarking on a new adventure as Scientific Advisor at LeanXcale.

My visit to MIT

Originally posted by Ricardo Jimenez on LinkedIn

https://www.linkedin.com/pulse/my-visit-mit-ricardo-jimenez-peris/

20190318_154406 (1).jpg

While in Boston for our stand at Enterprise Data World, I took the opportunity to visit a couple of groups at the MIT Media Lab.

The visit was really interesting. I first met with Esteban Moro, who is doing very interesting work on extracting geo-temporal insights about the users of an app at very large scale. It turns out to be a very interesting use case for the LeanXcale database, one that makes use of its dual interface, key-value and SQL. The problem involves ingesting data at very high rates and in very large volumes, which makes the key-value interface quite suitable, while at the same time running many queries based on keys, ranges and distance.

Later, I met with Luis Alonso, who is doing amazing work on the next evolution of smart cities. Until now, most of what I had seen around smart cities was just small data that did not provide any appealing insights. However, his group is taking a holistic approach in which they have managed to involve different data providers in different cities to discover genuinely interesting insights that can help improve the efficiency of cities. In the photo with him, you can see one of their smart city models. They use Lego blocks to build a 3D model of the city, and then with an overhead projector they project the behaviour of people segmented according to different parameters, such as nationality, gender and age, so one can see how behaviour differs across population segments during different events. It is really interesting work, and it also turns out to be a very interesting use case for LeanXcale, since it requires correlating massive amounts of data in real time.

Top database researcher Patrick Valduriez joins the LeanXcale team

We are happy to announce that Patrick Valduriez has now joined LeanXcale as scientific advisor!


Patrick is a senior scientist at Inria, a world-famous research organization in computer science and mathematics in France. He was also a senior researcher at Microelectronics and Computer Technology Corp. in Austin (Texas) in the 1980s and a professor at University Pierre et Marie Curie (UPMC) in Paris in the early 2000s. He has made extensive research contributions to distributed and parallel database systems and successful technology transfers to companies.

He has been a consultant for major companies in the USA (HP Labs, Lucent Bell Labs, NERA, LECG, Microsoft), Europe (ESA, Eurocontrol, Ask, Shell) and France (Bull, Capgemini, Matra, Murex, Orsys, Schlumberger, Sodifrance, Teamlog). Now, LeanXcale joins this impressive list.

Patrick has received prestigious awards and prizes, including the best paper award at the VLDB conference in 2000, the IBM France scientific prize in 1993 and the Innovation Award from Inria-Académie des Sciences-Dassault Systems in 2014. He is an ACM Fellow.

Patrick is now involved in the launch of the fourth edition of the best-selling book “Principles of Distributed Database Systems”, co-authored with Professor Tamer Özsu from the University of Waterloo. Since its first edition back in 1991, the book has become the leading textbook on distributed data management. The new edition will feature new material on recent hot topics such as big data, NoSQL, HTAP, web data integration and blockchain.

In this new adventure as advisor, Patrick will share his knowledge with LeanXcale’s engineers to improve their HTAP SQL engine and evangelize the company’s disruptive technology.