Information Retrieval

Course ID: CEID_NE5597
Department: Division of Computer Software
Level: Undergraduate
Professor: MAKRIS CHRISTOS
Semester: Winter
ECTS: 5
  • Introductory notions (user modeling, document logical representation, retrieval process).
  • Performance evaluation metrics (recall, precision, average precision, R-precision, precision histograms, NDCG metric, harmonic mean, user-oriented metrics).
  • Information retrieval modeling.
  • Set-oriented models (Boolean model, fuzzy set model, extended Boolean model), algebraic models (vector space models, latent semantic indexing model, topic models), probabilistic models (classical and language models).
  • Web information retrieval and its peculiarities.
  • Web search engines (crawler, indexer). The HITS algorithm (Hyperlink-Induced Topic Search). The Google search engine (the PageRank metric). The SALSA algorithm and related variants for web search.
  • Machine learning techniques and neural models in information retrieval: learning to rank; vector representations of words and word embeddings (word2vec, CBOW, skip-gram); Transformers, BERT, GPT, and large language models; sparse versus dense search; vector search, e.g. with FAISS (HNSW etc.); dense and/or sparse search in Retrieval-Augmented Generation (RAG); search engines versus reasoning engines.
  • Indexing structures (inverted files, signature files, bitmaps).
  • Storage techniques in distributed information retrieval (MapReduce, Apache Spark).
  • Full-text indexing structures in main memory (suffix trees, suffix arrays, directed acyclic word graphs (DAWGs) for strings) and in secondary memory (supra-suffix array, prefix B-tree, string B-tree).
  • Compression algorithms for text and for indexing structures.
  • Text mining and graph-based models (graph embeddings).