Semidefinite Programming and Combinatorial Optimization

AIAA 3225 · Learning and Optimization for Artificial Intelligence

Week 6: SDP and Combinatorial Optimization

Instructor: Prof. Xin Wang

Date: October 16, 2025

Matrix Norms via SDP

Nuclear & Spectral Norms

Maximum Cut Problem

Graph Laplacian & Binary Encoding

Goemans-Williamson

0.878-Approximation Algorithm

Shannon Capacity

Lovász Theta Function

Matrix Norms via Semidefinite Programming

Schatten p-Norms

For a matrix $A \in \mathbb{R}^{m \times n}$ with singular values $\sigma_1(A) \ge \sigma_2(A) \ge \cdots \ge 0$:

$$\|A\|_p := \begin{cases} \left( \sum_{k=1}^{\min(m,n)} \sigma_k^p(A) \right)^{1/p}, & 1 \le p < \infty \\ \sigma_1(A), & p = \infty \end{cases}$$

Nuclear Norm ($p=1$): $\|A\|_1 = \sum_k \sigma_k(A)$

Sum of all singular values (also called trace norm)

  • Convex envelope of the rank function on the spectral-norm unit ball
  • Used in matrix completion and low-rank approximation
  • Sparsity-inducing norm for matrices

Frobenius Norm ($p=2$): $\|A\|_F = \sqrt{\sum_k \sigma_k^2(A)} = \sqrt{\sum_{i,j} A_{ij}^2}$

Euclidean norm of the matrix entries

  • Also equals $\sqrt{\operatorname{tr}(A^T A)}$
  • Natural extension of the vector $\ell_2$ norm
  • Easy to compute directly from entries

Spectral Norm ($p=\infty$): $\|A\|_\infty = \sigma_1(A)$

Largest singular value (also called the operator norm or $\ell_2 \to \ell_2$ operator norm)

  • Equals $\max_{\|x\|_2 = 1,\ \|y\|_2 = 1} y^T A x$
  • For symmetric matrices: $\|A\|_\infty = \max\{|\lambda_{\max}(A)|, |\lambda_{\min}(A)|\}$
  • Measures the maximum stretching factor
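These relationships are easy to sanity-check numerically; a small sketch using numpy's SVD (the variable names are ours, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
s = np.linalg.svd(A, compute_uv=False)   # singular values, descending

nuclear = s.sum()                   # ||A||_1: sum of singular values
frobenius = np.sqrt((s**2).sum())   # ||A||_F: sqrt of sum of squares
spectral = s[0]                     # ||A||_inf: largest singular value

# Frobenius norm equals the entrywise Euclidean norm and sqrt(tr(A^T A))
assert np.isclose(frobenius, np.linalg.norm(A))
assert np.isclose(frobenius, np.sqrt(np.trace(A.T @ A)))
# The Schatten norms are ordered: ||A||_inf <= ||A||_F <= ||A||_1
assert spectral <= frobenius <= nuclear
```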

Norm Duality

The nuclear and spectral norms are dual to each other:

$$\|A\|_\infty = \max_{\|Y\|_1 \le 1} \langle A, Y \rangle, \qquad \|A\|_1 = \max_{\|Y\|_\infty \le 1} \langle A, Y \rangle$$

where $\langle A, Y \rangle = \operatorname{tr}(A^T Y)$ is the trace inner product.

Spectral Norm via SDP

For a symmetric matrix $M \in \mathbb{S}^n$, the spectral norm is:

$$\|M\|_\infty = \max_{\|x\|_2 = 1} |x^T M x| = \max\{|\lambda_{\max}(M)|, |\lambda_{\min}(M)|\}$$

SDP Formulation via Schur Complement

Consider the block matrix:

$$X(t) = \begin{pmatrix} t I_n & M \\ M & t I_n \end{pmatrix} \in \mathbb{S}^{2n}$$

Key Insight (Schur Complement): for $t > 0$, $X(t) \succeq 0$ if and only if:

  • $t I_n \succ 0$ (automatic for $t > 0$)
  • $t I_n - M (t I_n)^{-1} M \succeq 0$, which gives $t^2 I_n - M^2 \succeq 0$
  • This holds iff $t^2 I_n \succeq M^2$, i.e., $t \ge \|M\|_\infty$

Primal SDP: $\quad \min \ t \quad \text{subject to} \quad \begin{pmatrix} t I_n & M \\ M & t I_n \end{pmatrix} \succeq 0$
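A quick numerical check of this characterization (an eigenvalue test stands in for an SDP solver; names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
M = (M + M.T) / 2                      # random symmetric matrix
t_star = np.linalg.norm(M, 2)          # spectral norm ||M||_inf

def block(t):
    """The 2n x 2n matrix X(t) = [[t I, M], [M, t I]]."""
    n = M.shape[0]
    return np.block([[t * np.eye(n), M], [M, t * np.eye(n)]])

# X(t) becomes PSD exactly at t = ||M||_inf, and fails just below it
assert np.linalg.eigvalsh(block(t_star)).min() >= -1e-9
assert np.linalg.eigvalsh(block(t_star - 1e-3)).min() < 0
```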

Nuclear Norm via SDP

For a general matrix $M \in \mathbb{R}^{m \times n}$, the nuclear norm is $\|M\|_1 = \sum_k \sigma_k(M)$:

SDP Formulation

Primal SDP: $\quad \min \ \tfrac{1}{2}\left(\operatorname{tr}(U) + \operatorname{tr}(V)\right) \quad \text{subject to} \quad \begin{pmatrix} U & M \\ M^T & V \end{pmatrix} \succeq 0, \quad U \in \mathbb{S}^m, \ V \in \mathbb{S}^n$

Correctness: at the optimum, complementarity forces $U = (M M^T)^{1/2}$ and $V = (M^T M)^{1/2}$, hence $\operatorname{tr}(U) + \operatorname{tr}(V) = 2 \sum_k \sigma_k(M)$.

Schur Complement Interpretation

The PSD constraint ensures:

  • $U, V \succeq 0$
  • $U - M V^{-1} M^T \succeq 0$ (when $V \succ 0$)
  • Together these force $\operatorname{tr}(U) + \operatorname{tr}(V) \ge 2 \sum_k \sigma_k(M)$, so the optimal value equals $\|M\|_1$
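The claimed optimizers can be verified directly; a sketch building $(MM^T)^{1/2}$ and $(M^TM)^{1/2}$ from numpy's SVD factors:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 3))
U_, s, Vt = np.linalg.svd(M, full_matrices=False)

U = U_ @ np.diag(s) @ U_.T     # (M M^T)^{1/2}
V = Vt.T @ np.diag(s) @ Vt     # (M^T M)^{1/2}
block = np.block([[U, M], [M.T, V]])

assert np.linalg.eigvalsh(block).min() >= -1e-9            # feasible (PSD)
assert np.isclose(np.trace(U) + np.trace(V), 2 * s.sum())  # objective = 2 ||M||_1
```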

Dual Problem

With dual variable $Y \in \mathbb{R}^{m \times n}$:

$$\max \ \langle M, Y \rangle$$

subject to:

$$\begin{pmatrix} I_m & Y \\ Y^T & I_n \end{pmatrix} \succeq 0$$

which is equivalent to $\|Y\|_\infty \le 1$.

Duality Connection

The unit balls $\{\|M\|_\infty \le 1\}$ and $\{\|M\|_1 \le 1\}$ are polar to each other under the trace inner product, reflecting the fundamental duality between the spectral and nuclear norms.

The Maximum-Cut Problem

Problem Setup

Weighted Undirected Graph: $G = (V, E, w)$ where:

  • $V = \{1, \ldots, n\}$ is the vertex set
  • $E \subseteq \{\{i,j\} : i, j \in V,\ i \ne j\}$ is the edge set
  • $w : E \to \mathbb{R}_{>0}$ assigns positive weights $w_{ij} > 0$

Cut Value: $\operatorname{val}(S, S^c) = \sum_{i \in S,\ j \in S^c,\ \{i,j\} \in E} w_{ij}$

Real-World Applications:

  • Social Networks: Community detection and clustering
  • VLSI Design: Circuit partitioning for chip layout
  • Statistical Physics: Ising spin glass ground states
  • Machine Learning: Feature selection and clustering
  • Image Segmentation: Computer vision applications

MAX-CUT Problem

$$\max_{S \subseteq V} \ \sum_{i \in S,\ j \in S^c,\ \{i,j\} \in E} w_{ij}$$

This is NP-hard!

No polynomial-time exact algorithm unless P = NP (Karp, 1972)

Simple Example: Triangle Graph K3

Complete graph on 3 vertices with unit weights:

  • Partition $(+,+,+)$ or $(-,-,-)$: cut value $= 0$ (all vertices on one side)
  • Partition $(+,+,-)$ and its permutations: cut value $= 2$ (two edges cut)

Optimal value: $2$ (any 2-vs-1 split)

Binary Encoding and Graph Laplacian

Binary {1,+1} Encoding

Associate to every cut $(S, S^c)$ the sign vector $x \in \{-1,+1\}^n$:

$$x_i = \begin{cases} +1 & \text{if } i \in S \\ -1 & \text{if } i \in S^c \end{cases}$$

Key Identity

For any edge $\{i,j\} \in E$:

$$\frac{1 - x_i x_j}{2} = \begin{cases} 1 & \text{if } x_i \ne x_j \text{ (edge is cut)} \\ 0 & \text{if } x_i = x_j \text{ (edge not cut)} \end{cases}$$

since $(1 - x_i x_j)/2 = (x_i - x_j)^2 / 4$ when $x_i, x_j \in \{-1,+1\}$

Graph Laplacian Matrix

Define $L_G \in \mathbb{S}^n$ such that:

$$x^T L_G x = \sum_{\{i,j\} \in E} w_{ij} (x_i - x_j)^2 \qquad \forall x \in \mathbb{R}^n$$

Explicit form:

$$(L_G)_{ij} = \begin{cases} \sum_{k : \{i,k\} \in E} w_{ik} & \text{if } i = j \text{ (weighted degree)} \\ -w_{ij} & \text{if } \{i,j\} \in E \\ 0 & \text{otherwise} \end{cases}$$

Note: $L_G$ is positive semidefinite with smallest eigenvalue $0$

Quadratic Reformulation

Step 1: Cut value in terms of the sign vector

$$\operatorname{val}(S, S^c) = \sum_{\{i,j\} \in E} w_{ij} \, \frac{1 - x_i x_j}{2}$$

Step 2: Convert to a quadratic form

$$= \frac{1}{4} \sum_{\{i,j\} \in E} w_{ij} (x_i - x_j)^2$$

Step 3: Express using the Laplacian

$$= \frac{1}{4} x^T L_G x$$

Conclusion: MAX-CUT $\iff \max_{x \in \{-1,1\}^n} \frac{1}{4} x^T L_G x$
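The three steps can be checked on the triangle $K_3$ by brute force; a small sketch (the helper `laplacian` is ours):

```python
import numpy as np
from itertools import product

def laplacian(n, edges):
    """Build the weighted graph Laplacian L_G from a dict {(i, j): w_ij}."""
    L = np.zeros((n, n))
    for (i, j), w in edges.items():
        L[i, i] += w; L[j, j] += w
        L[i, j] -= w; L[j, i] -= w
    return L

edges = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0}   # triangle K3, unit weights
L = laplacian(3, edges)

best = 0.0
for x in product([-1.0, 1.0], repeat=3):
    x = np.array(x)
    cut = x @ L @ x / 4                            # (1/4) x^T L_G x
    direct = sum(w * (1 - x[i] * x[j]) / 2 for (i, j), w in edges.items())
    assert np.isclose(cut, direct)                 # Steps 1-3 agree
    best = max(best, cut)

assert best == 2.0                                 # MAX-CUT(K3) = 2
```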

SDP Relaxation of MAX-CUT

Matrix Lifting Technique

Given feasible $x \in \{-1,1\}^n$, define $X := x x^T \in \mathbb{S}^n$

Property 1

$X \succeq 0$

(PSD as a Gram matrix)

Property 2

$\operatorname{rank}(X) = 1$

(Rank-one structure)

Property 3

$X_{ii} = 1$

(Since $x_i^2 = 1$)

Key observation: $x^T L_G x = \sum_{i,j} (L_G)_{ij} x_i x_j = \sum_{i,j} (L_G)_{ij} X_{ij} = \operatorname{tr}(L_G X)$

Semidefinite Relaxation

Drop the non-convex rank-one constraint:

$$\max \ \tfrac{1}{4} \operatorname{tr}(L_G X) \quad \text{subject to} \quad X \succeq 0, \quad X_{ii} = 1, \ i = 1, \ldots, n$$

Alternative formulation, using $\operatorname{tr}(L_G X) = \sum_{\{i,j\} \in E} w_{ij} (X_{ii} + X_{jj} - 2 X_{ij})$:

$$\max \ \tfrac{1}{2} \sum_{\{i,j\} \in E} w_{ij} (1 - X_{ij}) \quad \text{subject to} \quad X \succeq 0, \quad X_{ii} = 1 \ \forall i$$

Why It's a Relaxation

  • Feasibility: every $x \in \{-1,1\}^n$ gives a feasible $X = x x^T$
  • Objective: same value, $\operatorname{tr}(L_G X) = x^T L_G x$
  • Relaxation: larger feasible set (rank constraint dropped)
  • Result: SDP optimum $\ge$ true MAX-CUT value
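A minimal numerical sketch of the lifting argument, on the 4-cycle with unit weights (the Laplacian below is hard-coded for $C_4$):

```python
import numpy as np

x = np.array([1.0, -1.0, 1.0, -1.0])   # a sign vector / cut
X = np.outer(x, x)                     # lifted matrix X = x x^T

L = np.array([[ 2., -1.,  0., -1.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [-1.,  0., -1.,  2.]])   # Laplacian of the 4-cycle

assert np.allclose(np.diag(X), 1)                 # X_ii = 1
assert np.linalg.eigvalsh(X).min() >= -1e-12      # X >= 0 (rank one)
assert np.isclose(np.trace(L @ X), x @ L @ x)     # same objective value
assert np.isclose(np.trace(L @ X) / 4, 4.0)       # this cut severs all 4 edges
```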

Geometric Interpretation

Cut Polytope: $\operatorname{conv}\{x x^T : x \in \{-1,1\}^n\}$

Elliptope: $\mathcal{E}_n = \{X \succeq 0 : X_{ii} = 1, \ i = 1, \ldots, n\}$

We have: Cut Polytope $\subseteq$ Elliptope

SDP optimizes over the larger elliptope

Fundamental Question: How close is the SDP optimum to the true MAX-CUT value?

Quick Check: Understanding SDP Relaxations

Question: Why is the SDP relaxation always an upper bound on MAX-CUT?

A. Because SDP solvers use approximation algorithms
B. Because the Laplacian matrix is positive semidefinite
C. Because we dropped the rank-1 constraint, enlarging the feasible set
D. Because Goemans-Williamson proved it in 1995
✓ Correct!
The SDP feasible region (elliptope) contains all rank-1 matrices xxT from the original problem, but also includes higher-rank PSD matrices. Since we're maximizing, a larger feasible set can only give an objective value ≥ the original optimum. This is the fundamental principle of convex relaxation.

Key Takeaway

Every convex relaxation of a discrete optimization problem provides an upper bound (for maximization) or lower bound (for minimization) on the optimal value. The quality of this bound depends on how tight the relaxation is.

Goemans-Williamson Algorithm

Main Approximation Result (Goemans-Williamson, 1995)

$$\alpha_{\mathrm{GW}} \cdot p_{\mathrm{SDP}}^* \ \le \ v_{\mathrm{MAX\text{-}CUT}} \ \le \ p_{\mathrm{SDP}}^*$$

where

$$\alpha_{\mathrm{GW}} = \min_{0 < \theta \le \pi} \frac{\theta/\pi}{(1 - \cos\theta)/2} \approx 0.87856$$

The minimum is achieved at $\theta^* \approx 2.331$ radians (equivalently $t = \cos\theta^* \approx -0.689$)

This is the best possible for polynomial-time algorithms assuming the Unique Games Conjecture (Khot et al., 2007)!

Randomized Hyperplane Rounding Algorithm

Input: optimal SDP solution $X^*$ with a factorization $X^* = V V^T$ (e.g., Cholesky)

where $V \in \mathbb{R}^{n \times r}$ (with $r = \operatorname{rank}(X^*)$) has rows $v_1, \ldots, v_n$ that are unit vectors: $\|v_i\|_2 = 1$

Algorithm Steps:
  1. Draw a random Gaussian vector $g \sim \mathcal{N}(0, I_r)$
  2. For each $i = 1, \ldots, n$: set $x_i := \operatorname{sign}(\langle v_i, g \rangle)$, with $\operatorname{sign}(0) := +1$
  3. Define the cut: $S = \{i : x_i = +1\}$, $S^c = \{i : x_i = -1\}$
  4. Output the cut $(S, S^c)$ with value $\frac{1}{4} x^T L_G x$
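The steps above can be sketched in numpy (the function name `gw_round` and the eigendecomposition-based factorization are our choices; any factorization $X = VV^T$ works). For $K_3$, the off-diagonal value $-1/2$ below is the known SDP optimum, with the three unit vectors at $120°$:

```python
import numpy as np

def gw_round(X, L, trials=200, seed=0):
    """Hyperplane rounding of a feasible SDP solution X (X >= 0, diag X = 1)."""
    rng = np.random.default_rng(seed)
    lam, Q = np.linalg.eigh(X)
    V = Q * np.sqrt(np.clip(lam, 0, None))    # rows v_i with X = V V^T
    best_val = -np.inf
    for _ in range(trials):
        g = rng.standard_normal(V.shape[1])   # random hyperplane normal
        x = np.where(V @ g >= 0, 1.0, -1.0)   # x_i = sign(<v_i, g>)
        best_val = max(best_val, x @ L @ x / 4)
    return best_val

# Triangle K3: SDP optimum has X_ij = -1/2 off-diagonal (vectors at 120 degrees)
X = 1.5 * np.eye(3) - 0.5 * np.ones((3, 3))
L = 3 * np.eye(3) - np.ones((3, 3))           # Laplacian of K3
assert gw_round(X, L) == 2.0                  # rounding recovers the optimal cut
```

Any hyperplane splits three coplanar vectors spaced $120°$ apart into a 2-vs-1 partition (almost surely), so every trial already returns the optimal cut of value 2.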

Geometric Interpretation

The algorithm cuts the unit vectors $\{v_i\}_{i=1}^n \subset \mathbb{R}^r$ by a random hyperplane through the origin with normal direction $g$. Vertices on the same side of the hyperplane are placed in the same partition.

Hyperplane: $\{u \in \mathbb{R}^r : \langle g, u \rangle = 0\}$

Analysis of Goemans-Williamson

Key Lemma: Expected Correlation

For the rounding $x_i = \operatorname{sign}(\langle v_i, g \rangle)$ with $g \sim \mathcal{N}(0, I_r)$:

$$\mathbb{E}[x_i x_j] = 1 - \frac{2}{\pi} \arccos(\langle v_i, v_j \rangle) = 1 - \frac{2\theta_{ij}}{\pi}$$

where $\theta_{ij} = \arccos(\langle v_i, v_j \rangle) \in [0, \pi]$ is the angle between $v_i$ and $v_j$

Proof Sketch

Let $\theta = \arccos(\langle v_i, v_j \rangle)$ be the angle between $v_i$ and $v_j$.

Key insight: $x_i x_j = -1$ (different signs) occurs exactly when the hyperplane separates $v_i$ from $v_j$, i.e., when the projection of $g$ onto the plane spanned by $v_i, v_j$ falls in one of two wedges of angular width $\theta$.

By rotational symmetry of the Gaussian:

$$\mathbb{P}\left[\operatorname{sign}(\langle v_i, g \rangle) \ne \operatorname{sign}(\langle v_j, g \rangle)\right] = \frac{\theta}{\pi}$$

Therefore:

$$\mathbb{E}[x_i x_j] = 1 \cdot (1 - \theta/\pi) + (-1) \cdot \theta/\pi = 1 - 2\theta/\pi$$
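The lemma is easy to confirm by Monte Carlo; a sketch for $\theta = \pi/3$ with two unit vectors in the plane (sample size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
theta = np.pi / 3
vi = np.array([1.0, 0.0])
vj = np.array([np.cos(theta), np.sin(theta)])

g = rng.standard_normal((200_000, 2))                # 200k random hyperplanes
disagree = ((g @ vi >= 0) != (g @ vj >= 0)).mean()   # fraction separated

assert abs(disagree - theta / np.pi) < 0.01          # P[signs differ] = theta/pi
assert abs((1 - 2 * disagree) - (1 - 2 * theta / np.pi)) < 0.02   # E[x_i x_j]
```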

Expected Cut Value

Using the identity $\operatorname{val}(S, S^c) = \frac{1}{4} x^T L_G x$:

$$\mathbb{E}[\operatorname{val}(S, S^c)] = \frac{1}{2} \sum_{\{i,j\} \in E} w_{ij} \left(1 - \mathbb{E}[x_i x_j]\right)
= \sum_{\{i,j\} \in E} w_{ij} \, \frac{\theta_{ij}}{\pi}$$

Comparison to SDP Value

The SDP optimal value is:

$$p_{\mathrm{SDP}}^* = \frac{1}{2} \sum_{\{i,j\} \in E} w_{ij} (1 - X_{ij}^*)$$

Since $X_{ij}^* = \langle v_i, v_j \rangle = \cos\theta_{ij}$:

$$p_{\mathrm{SDP}}^* = \frac{1}{2} \sum_{\{i,j\} \in E} w_{ij} (1 - \cos\theta_{ij})$$

Critical Approximation Bound

Define $\alpha(\theta) := \dfrac{\theta/\pi}{(1 - \cos\theta)/2}$ for $\theta \in (0, \pi]$

Then: $\dfrac{\theta}{\pi} \ge \alpha_{\mathrm{GW}} \cdot \dfrac{1 - \cos\theta}{2}$, where $\alpha_{\mathrm{GW}} = \min_{\theta \in (0,\pi]} \alpha(\theta) \approx 0.87856$

Therefore:

$$\mathbb{E}[\operatorname{val}(S, S^c)] = \sum_{\{i,j\} \in E} w_{ij} \frac{\theta_{ij}}{\pi} \ \ge \ \alpha_{\mathrm{GW}} \cdot \frac{1}{2} \sum_{\{i,j\} \in E} w_{ij} (1 - \cos\theta_{ij}) = \alpha_{\mathrm{GW}} \, p_{\mathrm{SDP}}^*$$
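The constant $\alpha_{\mathrm{GW}}$ can be reproduced by a simple grid search over $\theta$ (grid resolution is an arbitrary choice):

```python
import numpy as np

theta = np.linspace(1e-6, np.pi, 200_001)
alpha = (theta / np.pi) / ((1 - np.cos(theta)) / 2)   # the ratio alpha(theta)

i = alpha.argmin()
assert abs(alpha[i] - 0.87856) < 1e-4          # alpha_GW
assert abs(theta[i] - 2.331) < 1e-2            # minimizer theta* (radians)
assert abs(np.cos(theta[i]) + 0.689) < 1e-2    # t = cos(theta*) = -0.689
```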

Recap: Part I – Matrix Norms & MAX-CUT

What We've Covered So Far

📊 Matrix Norms via SDP

  • Nuclear norm $\|\cdot\|_1$ and spectral norm $\|\cdot\|_\infty$
  • Dual relationship via trace inner product
  • SDP formulations using Schur complements
  • Applications to matrix completion

✂️ Maximum Cut Problem

  • NP-hard combinatorial optimization
  • Quadratic formulation via graph Laplacian
  • SDP relaxation through matrix lifting
  • 0.878-approximation via randomized rounding

🔑 Key Techniques

  • Convex Relaxation: Drop non-convex constraints (e.g., rank-1) to get tractable problems
  • Schur Complement: Essential tool for converting matrix inequalities to SDP form
  • Randomized Rounding: Extract discrete solutions from continuous SDP solutions
  • Probabilistic Analysis: Prove approximation guarantees in expectation

Coming Up Next: Shannon Capacity

We'll explore how SDP techniques apply to information theory and graph coloring problems through the Lovász theta function.

Shannon Capacity and Communication

Noisy Communication Model

Setting: Transmit symbols from alphabet {1,,n} over a noisy channel

Confusability Graph G=(V,E):

  • Vertices V={1,2,,n} represent symbols
  • Edge {i,j}E means symbols i,j may be confused by receiver
  • Safe transmission requires independent set (no edges between used symbols)
$$\alpha(G) = \max\{|S| : S \subseteq V, \ \{i,j\} \notin E \text{ for all } i, j \in S, \ i \ne j\}$$

$\alpha(G)$ is the independence number (the maximum size of an independent set)

Block Coding Strategy

Transmit blocks of length $k$: $(i_1, \ldots, i_k) \in V^k$

Strong Graph Product

$G^{\boxtimes k}$ has vertex set $V^k$ and edges:

$$\{(i_1, \ldots, i_k), (j_1, \ldots, j_k)\} \in E(G^{\boxtimes k})$$

if for every coordinate $\ell$, either $i_\ell = j_\ell$ or $\{i_\ell, j_\ell\} \in E(G)$

Shannon Capacity (Shannon, 1956)

$$\Theta(G) = \lim_{k \to \infty} \left( \alpha(G^{\boxtimes k}) \right)^{1/k}$$

(The limit exists and equals $\sup_k \alpha(G^{\boxtimes k})^{1/k}$ by Fekete's lemma, since $\{\log \alpha(G^{\boxtimes k})\}$ is superadditive)

Information Rate: $\log_2 \Theta(G)$ bits per symbol

Alternative form:

$$C_0(G) = \lim_{k \to \infty} \frac{1}{k} \log_2 \alpha(G^{\boxtimes k})$$

Famous Example: Pentagon C5

  • $\alpha(C_5) = 2$ (can use 2 non-adjacent symbols)
  • $\alpha(C_5^{\boxtimes 2}) = 5$ (block coding achieves more!)
  • Lovász proved: $\Theta(C_5) = \sqrt{5} \approx 2.236$
  • Open problem: what is $\Theta(C_7)$? (Known: $3 \le \Theta(C_7) \le 3.3177\ldots$)
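Shannon's classical construction behind $\alpha(C_5^{\boxtimes 2}) = 5$ can be checked by brute force; a short sketch (helper names are ours):

```python
from itertools import combinations

cycle = {(i, (i + 1) % 5) for i in range(5)}            # edges of C5
adj = lambda a, b: (a, b) in cycle or (b, a) in cycle

# Independence number of C5 itself: alpha(C5) = 2
alpha = max(len(S) for r in range(6) for S in combinations(range(5), r)
            if not any(adj(a, b) for a, b in combinations(S, 2)))
assert alpha == 2

# Strong-product adjacency: distinct blocks confusable in every coordinate
def strong_adj(u, v):
    return u != v and all(a == b or adj(a, b) for a, b in zip(u, v))

# Shannon's independent set of size 5 in C5 x C5: the pairs (i, 2i mod 5)
S = [(i, 2 * i % 5) for i in range(5)]
assert not any(strong_adj(u, v) for u, v in combinations(S, 2))
```

This confirms $\alpha(C_5^{\boxtimes 2}) \ge 5$, hence $\Theta(C_5) \ge \sqrt{5}$.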

The Lovász Theta Function

Orthonormal Representation (Lovász, 1979)

A collection $(c, u_1, \ldots, u_n)$ of unit vectors in $\mathbb{R}^m$ is an orthonormal representation of $G$ if:

$$\langle u_i, u_j \rangle = 0 \quad \text{whenever } \{i,j\} \notin E \text{ and } i \ne j$$

Non-adjacent vertices must have orthogonal unit vectors!

Note: all vectors are unit: $\|c\|_2 = \|u_i\|_2 = 1$

Lovász $\vartheta$ Function

$$\vartheta(G) := \min_{(c, u_1, \ldots, u_n)} \ \max_{1 \le i \le n} \ \frac{1}{\langle c, u_i \rangle^2}$$

where the minimum is over all orthonormal representations of $G$.

Alternative formulation (Lovász): $\vartheta(G) = \max \sum_{i=1}^n \langle c, u_i \rangle^2$, the maximum over unit vectors $c$ and orthonormal representations $(u_1, \ldots, u_n)$ of the complement $\bar{G}$

Lower Bound

$$\alpha(G) \le \vartheta(G)$$

An independent set gives pairwise orthogonal vectors

Upper Bound

$$\vartheta(G) \le \chi(\bar{G})$$

$\chi(\bar{G})$ is the chromatic number of the complement

Computability

Polynomial Time

Via semidefinite programming!

Fundamental Inequalities (Lovász, 1979)

$$\alpha(G) \ \le \ \Theta(G) \ \le \ \vartheta(G) \ \le \ \chi(\bar{G})$$

Furthermore, $\vartheta$ is multiplicative under the strong product: $\vartheta(G \boxtimes H) = \vartheta(G)\,\vartheta(H)$

This implies $\vartheta(G^{\boxtimes k}) = \vartheta(G)^k$, so $\Theta(G) = \lim_k \alpha(G^{\boxtimes k})^{1/k} \le \lim_k \vartheta(G^{\boxtimes k})^{1/k} = \vartheta(G)$

Since computing α(G) is NP-hard, ϑ(G) provides a polynomial-time computable upper bound!

SDP Formulation of ϑ(G)

Primal SDP via Gram Matrix

Fix a unit vector $c$ and an orthonormal representation $(u_1, \ldots, u_n)$ of the complement $\bar{G}$, and let $X \in \mathbb{S}^{n+1}$ be the Gram matrix of the $n+1$ vectors $x_0 = c$ and $x_i = \langle c, u_i \rangle\, u_i$:

$$X_{ij} = \langle x_i, x_j \rangle, \qquad i, j = 0, 1, \ldots, n$$

Primal SDP: $\quad \vartheta(G) = \max \ \sum_{i=1}^n X_{0i} \quad \text{subject to} \quad X_{00} = 1, \quad X_{ii} = X_{0i} \ (i = 1, \ldots, n), \quad X_{ij} = 0 \ \forall \{i,j\} \in E, \quad X \succeq 0$

Note: the objective equals $\sum_{i=1}^n \langle c, u_i \rangle^2$, which is exactly Lovász's max-form of $\vartheta(G)$

Constraint Interpretation

  • $X_{00} = 1$: the handle $c$ is a unit vector
  • $X_{ii} = X_{0i}$: the $i$-th Gram vector is $\langle c, u_i \rangle u_i$, of squared length $\langle c, u_i \rangle^2$
  • $X_{ij} = 0$ for edges $\{i,j\} \in E$: adjacent vertices carry orthogonal $u_i, u_j$ (an orthonormal representation of $\bar{G}$)
  • $X \succeq 0$: valid Gram matrix structure

Dual SDP (Simplified)

$$\vartheta(G) = \min \ \lambda \quad \text{subject to} \quad \lambda I - J - Z \succeq 0, \quad Z \in \mathbb{S}^n, \ Z_{ij} = 0 \text{ unless } \{i,j\} \in E$$

where $J$ is the all-ones matrix and the slack matrix $Z$ is supported on the edges of $G$; equivalently, $\vartheta(G) = \min \lambda_{\max}(A)$ over symmetric $A$ with $A_{ij} = 1$ whenever $i = j$ or $\{i,j\} \notin E$ (Lovász, 1979)

Polynomial-Time Solvability

Both primal and dual satisfy Slater's condition, ensuring:

  • Optimal values are equal (strong duality)
  • Optimal solutions exist
  • Solvable in polynomial time via interior-point methods
  • Complexity: $O(n^{4.5})$ or better with modern SDP solvers

Worked Example: Pentagon C5

Computing $\vartheta(C_5) = \sqrt{5}$

The pentagon $C_5$ has 5 vertices arranged in a cycle. By symmetry, an optimal dual solution can be sought among circulant certificates

$$A(t) = J + t\, A_{C_5}$$

i.e., entries $1$ on the diagonal and on the non-edges, and $1 + t$ on the five cycle edges. (For the pentagon, the complement $\bar{C_5}$ is again a 5-cycle, on the distance-2 pairs.)

Dual Certificate

Since $J$ and $A_{C_5}$ are circulants, they diagonalize in the same Fourier basis. The eigenvalues of $A(t)$ are:

  • All-ones vector: $5 + 2t$
  • $2t\cos(2\pi/5) = (\varphi - 1)\, t$ (multiplicity 2)
  • $2t\cos(4\pi/5) = -\varphi\, t$ (multiplicity 2)

where $\varphi = \frac{1+\sqrt{5}}{2}$ is the golden ratio

Optimal Value

Taking $t < 0$ and balancing the two competing eigenvalues, $5 + 2t = -\varphi t$, gives

$$t^* = -\frac{5 - \sqrt{5}}{2}, \qquad \lambda_{\max}(A(t^*)) = \sqrt{5}$$

hence $\vartheta(C_5) \le \sqrt{5}$

Verification via Primal

The primal optimum is achieved by the Lovász umbrella: unit vectors in $\mathbb{R}^3$ with handle $c$ and five rib vectors $u_1, \ldots, u_5$, opened just enough that the required pairs are orthogonal. By symmetry, all inner products satisfy

$$\langle c, u_i \rangle^2 = \frac{1}{\sqrt{5}}$$

This gives $\vartheta(C_5) \ge \sum_{i=1}^5 \langle c, u_i \rangle^2 = 5 \cdot \frac{1}{\sqrt{5}} = \sqrt{5}$, so $\vartheta(C_5) = \sqrt{5}$
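The dual certificate can be checked numerically; a sketch assuming the circulant family $J + t\,A_{C_5}$ (for the pentagon, an equivalent family supported on the complement edges behaves the same way, since $\bar{C_5} \cong C_5$):

```python
import numpy as np

J = np.ones((5, 5))
A = np.zeros((5, 5))
for i in range(5):                       # adjacency matrix of the 5-cycle
    A[i, (i + 1) % 5] = A[(i + 1) % 5, i] = 1

ts = np.linspace(-2.0, 0.0, 20_001)      # grid over the free weight t
lam_max = np.array([np.linalg.eigvalsh(J + t * A)[-1] for t in ts])

assert abs(lam_max.min() - np.sqrt(5)) < 1e-3                    # theta(C5) = sqrt(5)
assert abs(ts[lam_max.argmin()] + (5 - np.sqrt(5)) / 2) < 1e-2   # t* = -(5 - sqrt(5))/2
```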

Final Result

Since we also know $\alpha(C_5^{\boxtimes 2}) = 5$, we have $\Theta(C_5) \ge \sqrt{\alpha(C_5^{\boxtimes 2})} = \sqrt{5}$, and therefore

$$\Theta(C_5) = \vartheta(C_5) = \sqrt{5}$$

This is one of the few cases where Shannon capacity is exactly computable!

Summary: SDP in Combinatorial Optimization

Key Paradigms We've Learned

🔧 Matrix Lifting

  • Replace variables x with matrices X=xxT
  • Drop non-convex rank constraints
  • Preserve essential structure via linear constraints
  • Enables convex relaxation of discrete problems

🎯 Randomized Rounding

  • Extract discrete solutions from SDP solutions
  • Hyperplane cuts in geometric representations
  • Performance guarantees via probabilistic analysis
  • Often achieves best-known approximation ratios

Matrix Norms

  • Spectral norm: $\|M\|_\infty$ via Schur complement
  • Nuclear norm: $\|M\|_1$ minimization
  • Duality: $\|\cdot\|_\infty$ and $\|\cdot\|_1$ are dual norms
  • Applications: matrix completion, robust PCA

Maximum Cut

  • Graph Laplacian formulation
  • SDP relaxation via matrix lifting
  • GW algorithm: 0.878-approximation
  • Best possible under UGC

Shannon Capacity

  • Information-theoretic channel capacity
  • Lovász $\vartheta$ function bounds $\Theta(G)$
  • Polynomial-time via SDP
  • Exact for $C_5$: $\Theta(C_5) = \sqrt{5}$

🚀 Broader Impact & Future Directions

Semidefinite programming has revolutionized approximation algorithms and beyond:

  • Machine Learning: Matrix completion, robust PCA, metric learning, kernel methods
  • Control Theory: Linear matrix inequalities (LMIs), robust control, $H_\infty$ control
  • Quantum Information: Entanglement detection, quantum games, separability
  • Combinatorial Optimization: Graph coloring, MAX-SAT, constraint satisfaction
  • Sum-of-Squares: Polynomial optimization, moment problems, hierarchies

Thank You!

Questions and Discussion