Dive into the essentials of Natural Language Processing and Large Language Models. This course offers a hands-on approach to NLP workflows, tokenization, and the Hugging Face ecosystem, building your expertise step-by-step. Perfect for software engineers eager to master NLP.
The main goal of every part of the course is to familiarize participants with NLP workflows and models, including LLMs, from a software engineering perspective, introducing mathematical theory only when strictly necessary.
The course was designed to be taken as a whole: each day builds on the knowledge gained in the previous ones, steadily expanding the participant's understanding of NLP. That said, sufficiently prepared or experienced participants should be able to skip certain days if necessary.
This day is a general introduction to NLP and the context in which its tasks are performed. The main goals are to cover the steps that should be performed before feeding data to a neural network, discuss how AI models work with textual data, and introduce participants to the Hugging Face ecosystem and its libraries, which are a staple of NLP workflows.
This day can be divided into two parts: an introduction to the various tools and the Hugging Face ecosystem, which is more of a high-level demonstration than a challenge, and more advanced sections such as tokenization, training custom tokenizers, word embeddings, and the Transformer architecture, some of them more theoretical than others. All newly introduced terms and techniques are explained from scratch, so no one is left in the dark.
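As a taste of what the hands-on part looks like, here is a minimal sketch of the kind of Hugging Face tooling covered on this day: loading a pretrained tokenizer and a ready-made pipeline. The checkpoint names and texts are only illustrative, not the exact ones used in the workshop.

```python
# A minimal, illustrative sketch of the Hugging Face tools covered on day 1.
# The model checkpoints used here are examples; any compatible checkpoint works.
from transformers import AutoTokenizer, pipeline

# Tokenization: turn raw text into the token IDs a model actually consumes.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("Natural Language Processing is fun!")
print(encoded["input_ids"])                                   # integer IDs of the sub-word tokens
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))  # the tokens themselves

# High-level pipeline: a ready-made model for a common NLP task.
classifier = pipeline("sentiment-analysis")
print(classifier("I really enjoyed this workshop."))
```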
Because of the introductory character of the first part, its difficulty is low, rising to medium in the later, more advanced sections.
This day is best suited to beginners with little to no NLP experience, since participants get to know a little bit of everything: from tokenization, through the transformers library and its use cases, to the whole Hugging Face ecosystem.
However, even advanced users already acquainted with the presented tools may find the more theoretical sections interesting and valuable, since understanding them is vital for successful participation in the following days of the course.
Throughout the various stages of the workshop we introduce ways to deal with imbalanced datasets, both during training and during evaluation.
On this day we take another step into the world of NLP, focusing on one of the most versatile tasks in the field: text classification. We learn about the metrics used to measure a model's performance, discover ways to work with imbalanced datasets and, most importantly, explore different approaches to classifying text.
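To illustrate why dedicated metrics matter for imbalanced data, here is a small sketch using scikit-learn with made-up labels; plain accuracy looks fine while class-sensitive metrics expose the problem.

```python
# Illustrative only: accuracy vs. class-sensitive metrics on an imbalanced label set.
from sklearn.metrics import accuracy_score, f1_score, classification_report

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]   # 80% of examples belong to class 0
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]   # a model that always predicts the majority class

print(accuracy_score(y_true, y_pred))             # 0.8 - looks deceptively good
print(f1_score(y_true, y_pred, average="macro"))  # much lower - exposes the imbalance
print(classification_report(y_true, y_pred, zero_division=0))
```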
The difficulty of this part of the course varies between medium (for sections such as the introduction of metrics, Masked Language Modeling, or working with datasets) and hard (managing class imbalance, using SHAP, and especially working with the PyTorch framework to implement an MLP).
In this part of the course we expect the more advanced participants to thrive, as we introduce more complex ways to work with neural networks, including devising their architecture ourselves with PyTorch.
Less experienced users should also find plenty of interest, such as new metrics, fine-tuning models with the transformers API, or working with datasets.
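The PyTorch part mentioned above boils down to defining a small network by hand. A minimal sketch of such an MLP classifier head might look as follows; the layer sizes are arbitrary placeholders.

```python
# A minimal MLP classifier in PyTorch, of the kind implemented during day 2.
# Input/hidden/output sizes are arbitrary placeholders.
import torch
from torch import nn

class MLPClassifier(nn.Module):
    def __init__(self, input_dim=768, hidden_dim=128, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.1),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, x):
        return self.net(x)  # raw logits; pair with nn.CrossEntropyLoss during training

model = MLPClassifier()
dummy_batch = torch.randn(4, 768)     # e.g. 4 sentence embeddings of size 768
print(model(dummy_batch).shape)       # torch.Size([4, 2])
```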
In this workshop, we delve into various token classification problems, highlighting their main challenges, potential pitfalls, and strategies to overcome them. We cover data preprocessing techniques and task-specific evaluation methods, leveraging several new libraries along the way.
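As a quick illustration of the task itself, token classification with the transformers pipeline looks roughly like this; the checkpoint below is just one example of a publicly available NER model, not necessarily the one used in the workshop.

```python
# Illustrative sketch: named entity recognition as a token classification task.
# "dslim/bert-base-NER" is one example of a publicly available NER checkpoint.
from transformers import pipeline

ner = pipeline("token-classification",
               model="dslim/bert-base-NER",
               aggregation_strategy="simple")   # merge sub-word pieces into whole entities

for entity in ner("Cognitum runs NLP workshops in Warsaw."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```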
This particular workshop's difficulty can safely be rated as medium, as it mostly leverages what participants should already be familiar with from the transformers library and pure Python, while providing more detail on the theoretical side of things.
We expect this part of the course to appeal to both beginner and advanced participants, as we delve deeper into possible use cases of NLP and discover another of its tasks, while mostly using the transformers API described in detail in the previous parts of the course.
For those more interested in the technicalities of model training and evaluation, we introduce monitoring tools and a new evaluation library.
We continue our journey through the NLP field with transformer models. This time we focus on the variants that leverage the full transformer architecture, also known as seq2seq models. We explore two of the tasks these models thrive at, question answering and text summarization, learn how to measure the quality of text generated by a model, and try to create a multi-task model able to perform both of the aforementioned tasks.
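A small sketch of the two ingredients mentioned above: generating a summary with a seq2seq model and scoring it with ROUGE. The checkpoint and texts are illustrative, and the evaluate library is assumed to be available.

```python
# Illustrative sketch: summarize with a seq2seq model and score the output with ROUGE.
from transformers import pipeline
import evaluate

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")  # example checkpoint
article = "Large language models are trained on vast text corpora. " * 10
summary = summarizer(article, max_length=40, min_length=10)[0]["summary_text"]

rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=[summary],
                       references=["Large language models are trained on vast text corpora."])
print(summary)
print(scores)   # rouge1 / rouge2 / rougeL overlap with the reference
```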
Given its similarity to the previous workshop, this one is also of medium difficulty. Just like the former, it contains significant theoretical content while also touching on important technical aspects of the systems solving these tasks, once again relying heavily on the transformers library.
As mentioned in the 'level' section, due to its similarities to day 3's workshop, we expect this part of the course to be enjoyable and valuable for participants at all levels of experience.
While keeping the formula similar to the previous workshop, we provide new material by introducing a whole new set of NLP problems, both classical and novel metrics for evaluating the models that solve them, and text generation methods that are important for yet another architectural variant of the Transformer.
Venturing away from the two previous days, we introduce participants to the concept of LLMs, prompt engineering, and the zero-shot and few-shot learning techniques, transitioning from fine-tuning a smaller, dedicated model for each task to using one LLM with suitable prompts for all of them.
We compare the results achieved by these two approaches on various tasks from the previous days and explore the newly introduced aspects of LLMs.
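The difference between the two prompting styles can be shown in a small sketch. The prompts below are purely illustrative, and `generate` is a placeholder for whatever LLM call is used during the workshop.

```python
# Illustrative only: zero-shot vs. few-shot prompting for a sentiment task.
# `generate` is a placeholder for whatever LLM call is used during the workshop.

zero_shot_prompt = (
    "Classify the sentiment of the following review as positive or negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)

few_shot_prompt = (
    "Review: I love this phone, the camera is amazing.\nSentiment: positive\n"
    "Review: Terrible support, never buying again.\nSentiment: negative\n"
    "Review: The battery died after two days.\nSentiment:"
)

def generate(prompt: str) -> str:
    """Placeholder: send the prompt to an LLM of your choice and return its completion."""
    raise NotImplementedError

# print(generate(zero_shot_prompt))
# print(generate(few_shot_prompt))
```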
This workshop acts as an introduction to the wide field of LLMs, so its difficulty level is low. We aim to make the transition from smaller models to LLMs as gentle and easy as possible, slowly but surely building the foundation for the more advanced techniques and aspects of leveraging the biggest neural networks.
While we hope everyone can gain some useful knowledge from this part of the course, it is mostly aimed at those who have barely touched Large Language Models at all, as this workshop takes participants a step below the online LLM user interfaces, to tweaking a model's hyperparameters and trying some new, albeit simple, prompting techniques.
In this workshop we refresh and expand participants' knowledge of the fundamental concepts of LLMs, discussing in detail the techniques introduced on the fifth day of the first session and adding some new ideas and methods, all with the aim of communicating with LLMs more effectively and thus increasing our control over their output.
Since this workshop is a continuation and expansion of the fifth day of the first session, its difficulty increases over the course of the notebook, from easy revision of the earlier material to medium when new concepts are introduced or existing ones are covered in more detail.
Once again using the last workshop of the first session as a reference point, we consider this part of the course suitable for those with limited knowledge of LLMs. It will prove especially useful for those who skipped the previous workshop altogether.
That being said, we did our best to develop the course in such a way that even more advanced participants can learn something new from every day and every session.
In this workshop, we explore various techniques and strategies for effective prompt crafting, ensuring that our communications harness the full potential of these powerful tools. We can split those approaches into two main categories based on their intent.
As we continue our journey into the intricacies of artificial intelligence, Day 2 shifts focus to the art and science of Prompt Engineering. This critical skill set involves crafting specific inputs that guide Large Language Models (LLMs) to generate desired outputs with higher precision and relevance. Prompt Engineering is not merely about asking questions; it’s about formulating them in a way that aligns closely with the model’s training and capabilities. Understanding this can significantly enhance the quality of interactions with LLMs, enabling more accurate and contextually appropriate responses.
As we move on to more and more complex prompt engineering techniques, the level also increases: from fairly easy (e.g. CoT prompting), through intermediate (like Tab-CoT), to really hard (see RAG and ReAct).
Since the difficulty and sophistication of the described techniques progress through the workshop, every participant should find their own niche and learn something new. While technically remaining within the scope of the transformers library, this part of the course focuses more on theory, as it introduces a large number of prompt engineering techniques.
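As a small taste of the easier end of that scale, a chain-of-thought (CoT) prompt simply asks the model to reason step by step before answering. The sketch below is illustrative only; `generate` again stands for any LLM call used during the workshop.

```python
# Illustrative chain-of-thought (CoT) prompt: the model is asked to reason step by step.
# `generate` again stands for any LLM call used during the workshop.

question = "A train travels 60 km in 45 minutes. What is its average speed in km/h?"

standard_prompt = f"Question: {question}\nAnswer:"

cot_prompt = (
    f"Question: {question}\n"
    "Let's think step by step, then give the final answer on the last line.\n"
    "Answer:"
)

# With CoT the model typically first writes out the intermediate reasoning
# (45 minutes = 0.75 h, 60 / 0.75 = 80) and only then states the result: 80 km/h.
```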
Diving deeper into the LLM field, in this workshop we explore several new aspects of it: LLMs acting as agents, and how to evaluate and fine-tune LLMs, whose size and complexity set them far apart from small task-specific models and create both theoretical and technical challenges.
Due to the introduction of new frameworks and tools, and because we broach new, more complex tasks, the difficulty of this workshop is rated as hard.
Since we delve into new, more sophisticated and technical territory, we recommend this part of the course to more advanced participants, especially those with experience with ML frameworks such as LangChain.
In this workshop we dig even deeper into fine-tuning LLMs, focusing on the efficiency of the process and on another important factor, alignment, aiming to improve this crucial part of LLM deployment.
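One common way to make LLM fine-tuning efficient is parameter-efficient fine-tuning such as LoRA. The agenda above does not name the exact toolset, so the sketch below, using the peft library and a small example base model, is only an assumed illustration.

```python
# Assumed illustration of parameter-efficient fine-tuning with LoRA via the peft library;
# the workshop may use a different toolset. The base checkpoint here is a small example model.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")

lora_config = LoraConfig(
    r=8,                    # rank of the low-rank update matrices
    lora_alpha=16,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()   # only a small fraction of weights will be trained
```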
With the introduction of yet another set of tools and approaches, which also build on the knowledge gained in prior days (especially the third day of the second session), the level remains the same: this workshop is considered hard, especially for those with limited coding experience and those who skipped the previous day.
Once again matching the difficulty of the previous workshop, we must emphasize that this part of the course is prepared with advanced participants in mind, particularly those who aim to customize their LLM-based solutions.
On the final day of the second session we move even lower in the technology stack of LLM-based systems. This time we discuss quantization methods and the differences between model training and inference, and explore methods that aim to speed up both.
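As a first intuition for what quantization does, here is a sketch using PyTorch's dynamic quantization on a toy model; the workshop itself may rely on other tools, so treat this only as an illustration of the idea.

```python
# Illustrative sketch: post-training dynamic quantization of a toy model's Linear layers.
# Weights are stored in int8 instead of float32, shrinking the model and speeding up CPU inference.
import torch
from torch import nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))

quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 512)
print(model(x).shape, quantized(x).shape)   # same interface, smaller and faster weights
```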
As we get closer and closer to the hardware beneath the ML, slowly venturing into the MLOps field, the level of difficulty increases again, making this workshop genuinely hard, especially on the technical side.
Without repeating ourselves too much, we recommend this particular workshop to those deeply interested in building and deploying systems that leverage LLMs. It is also important to mention that previous experience with the aforementioned tools is essential.
In this session as a whole we venture into a specific, highly sophisticated application of LLMs called Retrieval Augmented Generation (RAG).
We learn in detail what this fancy-sounding term really means, how such a solution can be leveraged in the real world, what the difference is between using a plain LLM and a RAG system, what components such a tool consists of, and how to build it.
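To make that component list concrete, here is a minimal, illustrative skeleton of the two core pieces: a retriever over a small document store and a prompt handed to a generator. The embedding model is an example and `generate` is a placeholder for any LLM call.

```python
# Minimal, illustrative RAG skeleton: embed documents, retrieve the closest one,
# and put it into the prompt of a generator. All names and texts are examples.
from sentence_transformers import SentenceTransformer, util

documents = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support is available Monday to Friday, 9:00-17:00 CET.",
    "Shipping to EU countries takes 3-5 business days.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")       # example embedding model
doc_embeddings = embedder.encode(documents, convert_to_tensor=True)

question = "How long do I have to return a product?"
query_embedding = embedder.encode(question, convert_to_tensor=True)

scores = util.cos_sim(query_embedding, doc_embeddings)[0]
best_doc = documents[int(scores.argmax())]

prompt = (
    "Answer the question using only the context.\n"
    f"Context: {best_doc}\nQuestion: {question}\nAnswer:"
)
# answer = generate(prompt)   # `generate` stands for any LLM call
print(best_doc)
```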
After getting to know the basics of Retrieval Augmented Generation yesterday, we are ready to dive deeper into its architecture. We have already built a demo from pre-prepared parts, but what if those elements don't satisfy our needs? In this session we explore the most popular technologies and frameworks that allow us to create our own components.
We will start with building a simple embedding-based retrieval which will serve as a baseline. We will then evaluate it on the test set and analyze what kind of errors it makes. This will help us understand the limitations of the simple retrieval model and the data we are working with. We will also explore how the choice of chunking strategy can affect the retrieval performance. Next, we will go back to the basics and learn about lexical search and when it can be used to improve retrieval. Then, we will add another component to our retrieval pipeline: the cross-encoder. We will learn how to use it and how it can improve the retrieval performance. Finally, we will come back to the embedding-based retrieval and see how to fine-tune it to further improve the performance.
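Of the components listed above, the cross-encoder is perhaps the least self-explanatory: unlike the embedding (bi-encoder) retriever, it scores each query-passage pair jointly and is typically used to re-rank a shortlist of candidates. A minimal sketch, with an example checkpoint:

```python
# Illustrative sketch: re-ranking retrieval candidates with a cross-encoder.
# Unlike a bi-encoder, the cross-encoder scores each (query, passage) pair jointly.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")   # example checkpoint

query = "How long do I have to return a product?"
candidates = [
    "Support is available Monday to Friday, 9:00-17:00 CET.",
    "Our refund policy allows returns within 30 days of purchase.",
    "Shipping to EU countries takes 3-5 business days.",
]

scores = reranker.predict([(query, passage) for passage in candidates])
best = max(zip(scores, candidates))
print(best)   # the refund-policy passage should score highest
```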
We will start with an overview of the metrics used to evaluate the generated responses. In particular, we will focus on how to use Large Language Models (LLMs) to analyze the generated responses and compare them to the ground truth. Then, we will do a short recap and build a simple RAG system which will serve as a baseline. We will evaluate it on the test set and analyze what kind of errors it makes. In the next stage, we will explore the context created from documents returned by the retrieval model. We will analyze how the quality of the context affects the generation performance. Next, we will fine-tune the generation model to align it to expected answers and improve the generation performance. Finally, we will explore different extensions to the RAG model.
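The idea of using an LLM to analyze generated responses, often called "LLM as a judge", can be illustrated with a simple prompt template. This sketch is illustrative only; `generate` is a placeholder for whichever model acts as the judge.

```python
# Illustrative sketch of LLM-as-a-judge evaluation: the judging model compares a
# generated answer against the ground truth and returns a score with a short rationale.
# `generate` is a placeholder for whatever LLM is used as the judge.

def build_judge_prompt(question: str, reference: str, candidate: str) -> str:
    return (
        "You are grading an answer to a question.\n"
        f"Question: {question}\n"
        f"Reference answer: {reference}\n"
        f"Candidate answer: {candidate}\n"
        "Rate the candidate from 1 (wrong) to 5 (fully correct and complete) "
        "and briefly explain why. Reply as: score: <1-5>, reason: <one sentence>."
    )

prompt = build_judge_prompt(
    question="How long do I have to return a product?",
    reference="Returns are accepted within 30 days of purchase.",
    candidate="You can return it within a month.",
)
# verdict = generate(prompt)   # e.g. "score: 5, reason: ..." from the judging model
```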
With Cognitum, you can confidently launch your project. Ensure scalability, security, performance and design with our product experts at your side.
Large Language Models (LLMs) are a type of machine learning model for natural language processing. They are trained on a large amount of text data and can generate human-like text based on the input they are given.
A private LLM is a large language model used exclusively by a specific organization. This guarantees data security and privacy, as the model and its associated data are not shared with other entities.
Yes, especially when you use private LLMs. These models are not shared with other entities, ensuring your data remains secure and complies with your stringent data policies.
Yes, LLMs can be seamlessly integrated with clients’ environments such as databases, websites, mobile apps, messaging apps, customer support platforms, and more.
To start implementing LLMs, reach out to us at Cognitum. We’ll discuss your specific needs and how our solutions can help you achieve your goals.
A Generative AI application is a type of artificial intelligence application that creates new content. It learns patterns and structure from its input training data and then generates new data based on them.