Talk: Seminars on Data Compression (Friday 20/12/2024, Room A2, Time: 17:00 to 19:00):
Speaker: Elias Machairas
Announcement Content:
Data Compression is both a science and an art. It is a science because many important aspects of it, notably entropy encoding, which is typically the last step of most if not all compression algorithms, have been thoroughly analysed and solved mathematically. These mathematical underpinnings of Data Compression, i.e. the science part, are collectively called Information Theory, a breakthrough theory invented by Claude Shannon in 1948. Data Compression is also an art because you need to know and understand your data. There is no silver bullet for removing dependencies from arbitrary data. Instead, we typically employ domain-specific transforms that work well empirically only for specific kinds of data: for example, some transforms work well for text and others work well for images.
In this course, we will be concerned with both theory and practice. That is, we will study real compression algorithms and how they work, but we will also rigorously prove important mathematical results that underpin these algorithms.
There will be two assignments in total. The first deals with compressing dice rolls; it is a warm-up assignment that sets the scene for the rest of the course and gets us familiar with its programming aspect. The second assignment will be on the DEFLATE algorithm, which is the compression algorithm behind gzip, zip, PNG and zlib. We will open up an actual implementation of DEFLATE and tweak things. If time permits, in the last part of the course we will develop parts of a state-of-the-art compressor comprising a Neural-Network-based Large Language Model (LLM) paired with an Arithmetic Encoder.
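To give a taste of how theory and practice meet, here is a minimal Python sketch. It is not taken from the course material or the assignments. It computes the Shannon entropy of a fair six-sided die and compares it with what DEFLATE (through Python's standard `zlib` module) achieves on simulated rolls. The fair die, the one-byte-per-roll storage, the fixed seed and the helper name `entropy_bits` are all assumptions made purely for illustration.

```python
import math
import random
import zlib


def entropy_bits(probabilities):
    """Shannon entropy, in bits per symbol, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Theory: a fair die carries log2(6) bits of information per roll.
fair_die_entropy = entropy_bits([1 / 6] * 6)

# Practice: simulate rolls and store them naively, one byte per roll
# (an illustrative assumption; 8 bits per roll is clearly wasteful).
random.seed(0)
rolls = [random.randint(1, 6) for _ in range(10_000)]
raw = bytes(rolls)

# zlib wraps a DEFLATE stream in a small header and checksum.
compressed = zlib.compress(raw, level=9)
bits_per_roll = 8 * len(compressed) / len(rolls)

# Lossless: decompressing gives back exactly the original rolls.
restored = list(zlib.decompress(compressed))

print(f"entropy of a fair die: {fair_die_entropy:.3f} bits per roll")
print(f"naive storage:         8.000 bits per roll")
print(f"zlib (DEFLATE):        {bits_per_roll:.3f} bits per roll")
```

For independent fair rolls, no lossless scheme can average fewer bits per roll than the entropy, so the zlib figure should land somewhere between the entropy and the naive 8 bits.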
Lectures and notes will be in English. Handout notes for the first lecture are attached to this announcement. There will be printed handout notes for each lecture.
Everyone interested is welcome. As this is an independent open course, it also aims to be friendly to people who are thinking of switching careers to Computer Science from a clean start and don't know where to begin. If along the way you find yourself in need of a more introductory course (this course closely follows the syllabus of a Data Compression university course typically offered in the 3rd or 4th year), pointers and guidance will be given.