<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Series on My Learning Notes</title>
    <link>https://learning-notes-dz2.pages.dev/categories/series/</link>
    <description>Recent content in Series on My Learning Notes</description>
    <generator>Hugo -- 0.124.0</generator>
    <language>en</language>
    <lastBuildDate>Tue, 16 Jun 2026 07:19:01 +0000</lastBuildDate>
    <atom:link href="https://learning-notes-dz2.pages.dev/categories/series/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>SafeDPO and Friends: Preference Optimization That Doesn&#39;t Sacrifice Safety</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-03-30/</link>
      <pubDate>Mon, 30 Mar 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-03-30/</guid>
      <description>DPO has problems — preference reversals, reward degradation, and a safety-helpfulness trade-off. Here&amp;rsquo;s how SafeDPO, RePO, and other recent variants are fixing them.</description>
    </item>
    <item>
      <title>RLHF Is Just Divergence Estimation in Disguise</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-03-22/</link>
      <pubDate>Sun, 22 Mar 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-03-22/</guid>
      <description>A unifying view of RLHF, DPO, and Constitutional AI — they&amp;rsquo;re all estimating the divergence between safe and unsafe output distributions. Plus a clean derivation of why DPO works.</description>
    </item>
    <item>
      <title>From Policy Gradient to PPO — Part 2: Trust Regions, PPO, and GRPO</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-03-18/</link>
      <pubDate>Wed, 18 Mar 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-03-18/</guid>
      <description>How trust regions stabilize policy optimization, why PPO became the default for RLHF, and how GRPO eliminates the critic entirely.</description>
    </item>
    <item>
      <title>From Policy Gradient to PPO — Part 1: Foundations</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-03-05/</link>
      <pubDate>Thu, 05 Mar 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-03-05/</guid>
      <description>MDPs, value functions, the REINFORCE algorithm, actor-critic methods, and generalized advantage estimation — the RL foundations you need before understanding RLHF.</description>
    </item>
    <item>
      <title>VAE Variants and Modern Interpretations</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-02-25/</link>
      <pubDate>Wed, 25 Feb 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-02-25/</guid>
      <description>A survey of where the VAE idea went after 2014 — VQ-VAE, hierarchical VAEs, adversarial hybrids, flow-based posteriors — and what the VAE really gave us beyond a specific architecture.</description>
    </item>
    <item>
      <title>Transformers from First Principles — Part 2: What Scale Reveals</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-02-20/</link>
      <pubDate>Fri, 20 Feb 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-02-20/</guid>
      <description>Sparse attention patterns, head specialization, rotary embeddings, gated attention, and the modern efficiency tricks that make large transformers actually trainable.</description>
    </item>
    <item>
      <title>β-VAE and the Emergence of Disentanglement</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-02-10/</link>
      <pubDate>Tue, 10 Feb 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-02-10/</guid>
      <description>A single Greek letter in front of the KL term changes what the VAE learns. We look at β-VAE as a rate-distortion trade-off, an information bottleneck, and a simple probe into disentangled representations.</description>
    </item>
    <item>
      <title>Transformers from First Principles — Part 1: Attention Is All You Need (Really)</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-02-08/</link>
      <pubDate>Sun, 08 Feb 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-02-08/</guid>
      <description>A first-principles walkthrough of the Transformer — self-attention, positional encoding, multi-head attention — with the math that makes it work.</description>
    </item>
    <item>
      <title>Conditional VAE (CVAE): Learning to Generate with Conditions</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-01-25/</link>
      <pubDate>Sun, 25 Jan 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-01-25/</guid>
      <description>We extend the VAE into a controllable generative model by adding a condition y into every term of the ELBO.</description>
    </item>
    <item>
      <title>Safety Neurons: 5% of Your Model Controls 90% of Safety</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-01-18/</link>
      <pubDate>Sun, 18 Jan 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-01-18/</guid>
      <description>Mechanistic interpretability meets alignment — how researchers found that a tiny fraction of neurons are responsible for almost all safety behavior in LLMs, and what that means.</description>
    </item>
    <item>
      <title>Dissecting the VAE Objective: KL, Reconstruction, and the Reparameterization Trick</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2026-01-10/</link>
      <pubDate>Sat, 10 Jan 2026 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2026-01-10/</guid>
      <description>We open the ELBO, compute each term, and meet the reparameterization trick — the idea that lets us backpropagate through randomness.</description>
    </item>
    <item>
      <title>Variational Inference: Cracking the Intractable Integral</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2025-12-20/</link>
      <pubDate>Sat, 20 Dec 2025 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2025-12-20/</guid>
      <description>Variational Inference transforms the impossible task of computing intractable integrals into a solvable optimization problem, providing the mathematical foundation for modern generative models like VAEs.</description>
    </item>
    <item>
      <title>Latent Variable Models: A Probabilistic Foundation</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2025-10-28/</link>
      <pubDate>Tue, 28 Oct 2025 00:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2025-10-28/</guid>
      <description>From PCA to Probabilistic PCA and general Latent Variable Models: the probabilistic lens that seeds VAEs.</description>
    </item>
    <item>
      <title>An overview of generative model paradigms</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2024-12-24/</link>
      <pubDate>Tue, 24 Dec 2024 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2024-12-24/</guid>
      <description>A summary of explicit, implicit and score-based generative models.</description>
    </item>
    <item>
      <title>Information Theory</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2024-10-05/</link>
      <pubDate>Sat, 05 Oct 2024 11:00:00 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2024-10-05/</guid>
      <description>Information theory essentials: entropy, cross-entropy, joint/conditional entropy, KL divergence, mutual information.</description>
    </item>
    <item>
      <title>Diffusion Models</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2024-06-11/</link>
      <pubDate>Tue, 11 Jun 2024 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2024-06-11/</guid>
      <description>Diffusion Models (DMs) consist of two processes: forward and backward.
Forward process. General idea: degrade the input data iteratively with noise, forward in time (i.e., as $t$ increases). Given an image $x_0 \sim q(x_0)$, where $q(x_0)$ is the data distribution, the forward process gradually adds Gaussian noise over $T$ time steps and produces the latent $x_T$. At each time step $t$, we sample $x_t$ from the Gaussian distribution $\mathcal{N}(\sqrt{1 - \beta_t} x_{t-1}, \beta_t)$, where the hyper-parameters $0 &amp;lt; \beta_{1:T} &amp;lt; 1$ set the variance of the noise added at each time step.</description>
    </item>
    <item>
      <title>Determinant of matrices, eigenvalues and eigenvectors</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2021-08-21/</link>
      <pubDate>Sat, 21 Aug 2021 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2021-08-21/</guid>
      <description>Determinants, eigenvalues, and eigenvectors: their geometric meaning, methods for finding them, and what they reveal about linear transformations.</description>
    </item>
    <item>
      <title>Span, basis, and dimension</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2021-08-07/</link>
      <pubDate>Sat, 07 Aug 2021 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2021-08-07/</guid>
      <description>Linear independence, span, basis, dimension: fundamental concepts for vector spaces and subspaces.</description>
    </item>
    <item>
      <title>The four fundamental subspaces in Linear Algebra</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2021-08-14/</link>
      <pubDate>Sat, 07 Aug 2021 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2021-08-14/</guid>
      <description>&amp;#34;This is really the heart of this approach to linear algebra, to see these four subspaces, how they are related.&amp;#34; - Prof. Gilbert Strang</description>
    </item>
    <item>
      <title>Solving Ax = b</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2021-08-01/</link>
      <pubDate>Sun, 01 Aug 2021 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2021-08-01/</guid>
      <description>Solving Ax=b: conditions for solutions, complete solution (particular + nullspace), rank relationships.</description>
    </item>
    <item>
      <title>Nullspace and solving Ax=0</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2021-07-27/</link>
      <pubDate>Tue, 27 Jul 2021 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2021-07-27/</guid>
      <description>Nullspace and solving Ax=0: special solutions, free variables, reduced row echelon form.</description>
    </item>
    <item>
      <title>Echelon Form and Rank of a matrix</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2021-07-25/</link>
      <pubDate>Sun, 25 Jul 2021 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2021-07-25/</guid>
      <description>Echelon form and matrix rank: row elimination, leading elements, and solving linear systems.</description>
    </item>
    <item>
      <title>Vector spaces and subspaces</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2021-07-23/</link>
      <pubDate>Fri, 23 Jul 2021 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2021-07-23/</guid>
      <description>Vector spaces, subspaces, column space: 8 axioms, subspace properties, and linear combinations.</description>
    </item>
    <item>
      <title>Vanishing Gradients</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2021-07-21/</link>
      <pubDate>Wed, 21 Jul 2021 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2021-07-21/</guid>
      <description>What are vanishing gradients? Why do they occur, and how can they be resolved?</description>
    </item>
    <item>
      <title>Basic concepts in Linear Algebra</title>
      <link>https://learning-notes-dz2.pages.dev/posts/2021-07-20/</link>
      <pubDate>Tue, 20 Jul 2021 01:12:07 +0700</pubDate>
      <guid>https://learning-notes-dz2.pages.dev/posts/2021-07-20/</guid>
      <description>Basic concepts of Linear Algebra: data types, notations, and so on</description>
    </item>
  </channel>
</rss>
