Big Data London 2023: Navigating the World of Generative AI and LLMs

by Vince Spadaro - Senior Interaction Designer, Experience Design Team.

The Big Data London 2023 conference was an enlightening experience, with Generative AI and Large Language Models (LLMs) emerging as the stars of the show. With the rising interest in Generative AI, the conference expanded its horizons, introducing two new theatres and facilitating 40 additional talks on this subject.

The discussions at the conference primarily revolved around the current challenges faced by LLMs and the innovative solutions the community is exploring.

Large Language Models (LLMs) are celebrated in the tech world for their ability to produce clear and contextually appropriate text. However, they come with their own peculiarities. At times they hallucinate, producing content that is fluent and confident but drifts away from hard facts. And if you pose the same question differently, you might receive an entirely different response. Why is that?

Diving deeper into the mechanics of LLMs, their behaviour boils down to a foundational principle: ‘they're trained to predict the next word in a sequence’. Imagine them as eager students who, rather than grasping the essence of a question, focus on stringing words together based on patterns they've learned. They don’t necessarily understand your query; they respond based on patterns in the vast amount of text they've been trained on. Haven't we all used similar tactics in school at some point?
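To make "predict the next word" concrete, here is a toy bigram model in Python. It is nothing like a real LLM's neural network (the corpus and code are invented for illustration), but it captures the same core idea: continuing text purely from patterns observed in training data, with no understanding of meaning.

```python
from collections import Counter, defaultdict

# A tiny, made-up training corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the word most frequently seen after `word` in the corpus."""
    return counts[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" — it follows "the" most often
```

The model answers from frequency alone: ask it what follows "the" and it says "cat", not because it knows anything about cats, but because that pattern dominates its training data.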

When comparing models like GPT-3 and GPT-4, it's evident that the latter has enhanced capabilities. Take this example: if you confidently told GPT-3 that Germany beat England 5-1 in the 2002 FIFA World Cup qualifiers, it might inadvertently agree, even though it was England who won with that score. GPT-4, being more advanced, is less prone to such missteps. However, simply feeding these models more data to produce GPT-5, GPT-6, and so on isn't the ultimate solution. Relying on sheer volume of data presents scalability issues, potential biases, and can sometimes miss context. The tech community is realising this and shifting focus: instead of just accumulating vast amounts of data, there's a growing emphasis on the quality and relevance of the data being used.

A standout revelation from this year's conference was the potential synergy between LLMs and Knowledge Graphs (KGs).

What exactly are knowledge graphs?

KGs are more than just data structures. They are intricate webs of contextual understanding that capture real-world entities and their interrelations in formats that both humans and machines can easily digest. (Building Knowledge Graphs – A Practitioner’s Guide, Jesús Barrasa & Jim Webber)

While LLMs are linguistic maestros, KGs are the meticulous librarians of the tech world, cataloguing information with precision. Marrying the two could be the key to unlocking AI that's not only articulate, but also accurate.
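To make the librarian metaphor concrete, a knowledge graph can be sketched as a set of (subject, relation, object) triples. The entities and relations below are illustrative examples, not any real system's schema:

```python
# A minimal knowledge graph: facts stored as (subject, relation, object) triples.
triples = [
    ("England", "beat", "Germany"),
    ("England", "playedIn", "2002 World Cup qualifiers"),
    ("Germany", "playedIn", "2002 World Cup qualifiers"),
]

def facts_about(entity: str) -> list:
    """Return every triple in which the entity appears as subject or object."""
    return [t for t in triples if entity in (t[0], t[2])]

print(facts_about("Germany"))
```

Unlike free-form text, each fact here is explicit and machine-checkable: a query for "Germany" returns precisely the relations it participates in, with nothing hallucinated.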

Vectorising data isn't a completely new concept. Industry giants like Spotify, Google, Meta, and Airbnb have been harnessing the power of vectors for years. Vector databases are particularly efficient with single categories of data, such as movies, music or hotels. However, simply bolting vector embeddings onto LLMs might not be the optimal approach.

As it stands, LLMs leverage patterns in unstructured data to make probabilistic predictions about language. Integrating LLMs with KGs could allow them to draw from a rich sub-graph instead of just a vector, resulting in a more nuanced response.
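The contrast can be sketched in a few lines. The data and function names below are invented for illustration: a vector lookup returns the single nearest embedding, while a graph lookup returns a whole sub-graph of related entities, giving the model relational context rather than one similar item.

```python
import math

# --- Vector retrieval: find the one document nearest to the query vector ---
docs = {"football": [1.0, 0.0], "music": [0.0, 1.0]}  # toy 2-D embeddings

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def nearest(query_vec):
    return max(docs, key=lambda d: cosine(docs[d], query_vec))

# --- Graph retrieval: expand outward from an entity to collect a sub-graph ---
graph = {"England": ["Germany", "2002 qualifiers"], "Germany": ["England"]}

def subgraph(entity, depth=1):
    seen = {entity}
    frontier = [entity]
    for _ in range(depth):
        frontier = [n for e in frontier for n in graph.get(e, []) if n not in seen]
        seen.update(frontier)
    return seen

print(nearest([0.9, 0.1]))  # "football" — the single closest embedding
print(subgraph("England"))  # the entity plus its related neighbours
```

The vector lookup answers "what is most similar?"; the sub-graph answers "what is this connected to, and how?", which is the richer context the conference speakers argued LLMs could draw on.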

Throughout the two-day conference, many experts explored the subject in depth. Though I'm not deeply familiar with databases or text mining, I found the presentations by Advancing Analytics, Neo4j, and Superlinked especially engaging. They skilfully simplified the complexities of the topic for their audience using relatable examples. Moreover, Neo4j has published a book on Knowledge Graphs. If you're keen to delve deeper into the subject, it's available to download for free via the link in the reference section below.

Just when you think you've caught up, along comes Advancing Analytics with a talk about the next big thing: the Retentive Model. Now, I won't pretend to dive deep into this topic, mainly because most of it went over my head, but if you're feeling particularly masochistic, there's an official paper, chock-full of mathematical jargon, proving RetNet's efficiency. Dive in, if you dare, via the link in the reference section below, but don't say I didn't warn you about the math overload!

Shifting away from the technicalities of graphs and vectors, let's ground ourselves in the present reality. Generative AI has accelerated software development. It has lowered the entry barrier, welcoming those without extensive training. This technology automates tasks and empowers individuals to achieve feats without necessarily having the expertise. And the intriguing part? Generative AI isn't even at its full potential yet.

Does this indicate the decline of hard skills? While I couldn't find a definitive answer, the World Economic Forum's Future of Jobs Report 2023 provides some insight. It lists the top skills to sharpen by 2027:

  1. Creative thinking
  2. Analytical thinking
  3. Digital literacy
  4. Lifelong learning and curiosity
  5. Resilience, flexibility, and agility
  6. Systems thinking
  7. Mastery of artificial intelligence and big data
  8. Motivation and self-awareness
  9. Talent management
  10. Service orientation and customer service

This insightful conference offered a comprehensive exploration of the vast world of Generative AI. As we navigate this expansive AI terrain, the focus is shifting from mere data accumulation to its quality and relevance. Integrating LLMs with Knowledge Graphs is set to refine our tools for deeper insights. Looking forward, a combination of tech skills, creativity, and adaptability will guide us. As we traverse the complex pathways of AI, endless opportunities and challenges lie ahead.



References

Building Knowledge Graphs – A Practitioner’s Guide

Retentive Network: A Successor to Transformer for Large Language Models

For those eager to catch the talks from Big Data London, recordings will be available on the YouTube channel in just a few weeks. Stay tuned and enjoy!


