Semidefinite Programming and Combinatorial Optimization

AIAA 3225 · Learning and Optimization for Artificial Intelligence

Week 6: SDP and Combinatorial Optimization

Instructor: Prof. Xin Wang

Date: October 16, 2025

Matrix Norms via SDP

Nuclear & Spectral Norms

Maximum Cut Problem

Graph Laplacian & Binary Encoding

Goemans-Williamson

0.878-Approximation Algorithm

Shannon Capacity

Lovász Theta Function

Matrix Norms via Semidefinite Programming

Schatten p-Norms

For a matrix $A \in \mathbb{R}^{m \times n}$ with singular values $\sigma_1(A) \ge \sigma_2(A) \ge \cdots \ge 0$:

$$\|A\|_p := \begin{cases} \left( \sum_{k=1}^{\min(m,n)} \sigma_k^p(A) \right)^{1/p}, & 1 \le p < \infty \\ \sigma_1(A), & p = \infty \end{cases}$$

Nuclear Norm ($p=1$): $\|A\|_1 = \sum_k \sigma_k(A)$

Sum of all singular values (also called trace norm)

  • Convex envelope of the rank function on the spectral-norm unit ball
  • Used in matrix completion and low-rank approximation
  • Sparsity-inducing norm for matrices

Frobenius Norm ($p=2$): $\|A\|_F = \sqrt{\sum_k \sigma_k^2(A)} = \sqrt{\sum_{i,j} A_{ij}^2}$

Euclidean norm of the matrix entries

  • Also equals $\sqrt{\operatorname{tr}(A^T A)}$
  • Natural extension of the vector $\ell_2$ norm
  • Easy to compute directly from entries

Spectral Norm ($p=\infty$): $\|A\|_\infty = \sigma_1(A)$

Largest singular value (also called the operator norm or $\ell_2 \to \ell_2$ operator norm)

  • Equals $\max_{\|x\|_2 = 1,\ \|y\|_2 = 1} y^T A x$
  • For symmetric matrices: $\|A\|_\infty = \max\{|\lambda_{\max}(A)|, |\lambda_{\min}(A)|\}$
  • Measures the maximum stretching factor
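These relationships are easy to sanity-check numerically; a small sketch using numpy's SVD (the variable names are ours, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
s = np.linalg.svd(A, compute_uv=False)   # singular values, descending

nuclear = s.sum()                   # ||A||_1: sum of singular values
frobenius = np.sqrt((s**2).sum())   # ||A||_F: sqrt of sum of squares
spectral = s[0]                     # ||A||_inf: largest singular value

# Frobenius norm equals the entrywise Euclidean norm and sqrt(tr(A^T A))
assert np.isclose(frobenius, np.linalg.norm(A))
assert np.isclose(frobenius, np.sqrt(np.trace(A.T @ A)))
# The Schatten norms are ordered: ||A||_inf <= ||A||_F <= ||A||_1
assert spectral <= frobenius <= nuclear
```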

Norm Duality

The nuclear and spectral norms are dual to each other:

$$\|A\|_\infty = \max_{\|Y\|_1 \le 1} \langle A, Y \rangle, \qquad \|A\|_1 = \max_{\|Y\|_\infty \le 1} \langle A, Y \rangle$$

where $\langle A, Y \rangle = \operatorname{tr}(A^T Y)$ is the trace inner product.

Spectral Norm via SDP

For a symmetric matrix $M \in \mathbb{S}^n$, the spectral norm is:

$$\|M\|_\infty = \max_{\|x\|_2 = 1} |x^T M x| = \max\{|\lambda_{\max}(M)|, |\lambda_{\min}(M)|\}$$

SDP Formulation via Schur Complement

Consider the block matrix:

$$X(t) = \begin{pmatrix} t I_n & M \\ M & t I_n \end{pmatrix} \in \mathbb{S}^{2n}$$

Key Insight (Schur Complement): for $t > 0$, $X(t) \succeq 0$ if and only if:

  • $t I_n \succ 0$ (automatic for $t > 0$)
  • $t I_n - M (t I_n)^{-1} M \succeq 0$, which gives $t^2 I_n - M^2 \succeq 0$
  • This holds iff $t^2 I_n \succeq M^2$, i.e., $t \ge \|M\|_\infty$

Primal SDP: $\quad \min \ t \quad \text{subject to} \quad \begin{pmatrix} t I_n & M \\ M & t I_n \end{pmatrix} \succeq 0$
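A quick numerical check of this characterization (an eigenvalue test stands in for an SDP solver; names are ours):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
M = (M + M.T) / 2                      # random symmetric matrix
t_star = np.linalg.norm(M, 2)          # spectral norm ||M||_inf

def block(t):
    """The 2n x 2n matrix X(t) = [[t I, M], [M, t I]]."""
    n = M.shape[0]
    return np.block([[t * np.eye(n), M], [M, t * np.eye(n)]])

# X(t) becomes PSD exactly at t = ||M||_inf, and fails just below it
assert np.linalg.eigvalsh(block(t_star)).min() >= -1e-9
assert np.linalg.eigvalsh(block(t_star - 1e-3)).min() < 0
```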

Nuclear Norm via SDP

For a general matrix $M \in \mathbb{R}^{m \times n}$, the nuclear norm is $\|M\|_1 = \sum_k \sigma_k(M)$:

SDP Formulation

Primal SDP: $\quad \min \ \tfrac{1}{2}\left(\operatorname{tr}(U) + \operatorname{tr}(V)\right) \quad \text{subject to} \quad \begin{pmatrix} U & M \\ M^T & V \end{pmatrix} \succeq 0, \quad U \in \mathbb{S}^m, \ V \in \mathbb{S}^n$

Correctness: at the optimum, complementarity forces $U = (M M^T)^{1/2}$ and $V = (M^T M)^{1/2}$, hence $\operatorname{tr}(U) + \operatorname{tr}(V) = 2 \sum_k \sigma_k(M)$.

Schur Complement Interpretation

The PSD constraint ensures:

  • $U, V \succeq 0$
  • $U - M V^{-1} M^T \succeq 0$ (when $V \succ 0$)
  • Together these force $\operatorname{tr}(U) + \operatorname{tr}(V) \ge 2 \sum_k \sigma_k(M)$, so the optimal value equals $\|M\|_1$
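The claimed optimizers can be verified directly; a sketch building $(MM^T)^{1/2}$ and $(M^TM)^{1/2}$ from numpy's SVD factors:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 3))
U_, s, Vt = np.linalg.svd(M, full_matrices=False)

U = U_ @ np.diag(s) @ U_.T     # (M M^T)^{1/2}
V = Vt.T @ np.diag(s) @ Vt     # (M^T M)^{1/2}
block = np.block([[U, M], [M.T, V]])

assert np.linalg.eigvalsh(block).min() >= -1e-9            # feasible (PSD)
assert np.isclose(np.trace(U) + np.trace(V), 2 * s.sum())  # objective = 2 ||M||_1
```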

Dual Problem

With dual variable $Y \in \mathbb{R}^{m \times n}$:

$$\max \ \langle M, Y \rangle$$

subject to:

$$\begin{pmatrix} I_m & Y \\ Y^T & I_n \end{pmatrix} \succeq 0$$

which is equivalent to $\|Y\|_\infty \le 1$.

Duality Connection

The unit balls $\{\|M\|_\infty \le 1\}$ and $\{\|M\|_1 \le 1\}$ are polar to each other under the trace inner product, reflecting the fundamental duality between the spectral and nuclear norms.

The Maximum-Cut Problem

Problem Setup

Weighted Undirected Graph: $G = (V, E, w)$ where:

  • $V = \{1, \ldots, n\}$ is the vertex set
  • $E \subseteq \{\{i,j\} : i, j \in V,\ i \ne j\}$ is the edge set
  • $w : E \to \mathbb{R}_{>0}$ assigns positive weights $w_{ij} > 0$

Cut Value: $\operatorname{val}(S, S^c) = \sum_{i \in S,\ j \in S^c,\ \{i,j\} \in E} w_{ij}$

Real-World Applications:

  • Social Networks: Community detection and clustering
  • VLSI Design: Circuit partitioning for chip layout
  • Statistical Physics: Ising spin glass ground states
  • Machine Learning: Feature selection and clustering
  • Image Segmentation: Computer vision applications

MAX-CUT Problem

$$\max_{S \subseteq V} \ \sum_{i \in S,\ j \in S^c,\ \{i,j\} \in E} w_{ij}$$

This is NP-hard!

No polynomial-time exact algorithm unless P = NP (Karp, 1972)

Simple Example: Triangle Graph K3

Complete graph on 3 vertices with unit weights:

  • Partition $(+,+,+)$ or $(-,-,-)$: cut value $= 0$ (all vertices on one side)
  • Partition $(+,+,-)$ and its permutations: cut value $= 2$ (two edges cut)

Optimal value: $2$ (any 2-vs-1 split)

Binary Encoding and Graph Laplacian

Binary {1,+1} Encoding

Associate to every cut $(S, S^c)$ the sign vector $x \in \{-1,+1\}^n$:

$$x_i = \begin{cases} +1 & \text{if } i \in S \\ -1 & \text{if } i \in S^c \end{cases}$$

Key Identity

For any edge $\{i,j\} \in E$:

$$\frac{1 - x_i x_j}{2} = \begin{cases} 1 & \text{if } x_i \ne x_j \text{ (edge is cut)} \\ 0 & \text{if } x_i = x_j \text{ (edge not cut)} \end{cases}$$

since $(1 - x_i x_j)/2 = (x_i - x_j)^2 / 4$ when $x_i, x_j \in \{-1,+1\}$

Graph Laplacian Matrix

Define $L_G \in \mathbb{S}^n$ such that:

$$x^T L_G x = \sum_{\{i,j\} \in E} w_{ij} (x_i - x_j)^2 \qquad \forall x \in \mathbb{R}^n$$

Explicit form:

$$(L_G)_{ij} = \begin{cases} \sum_{k : \{i,k\} \in E} w_{ik} & \text{if } i = j \text{ (weighted degree)} \\ -w_{ij} & \text{if } \{i,j\} \in E \\ 0 & \text{otherwise} \end{cases}$$

Note: $L_G$ is positive semidefinite with smallest eigenvalue $0$

Quadratic Reformulation

Step 1: Cut value in terms of the sign vector

$$\operatorname{val}(S, S^c) = \sum_{\{i,j\} \in E} w_{ij} \, \frac{1 - x_i x_j}{2}$$

Step 2: Convert to a quadratic form

$$= \frac{1}{4} \sum_{\{i,j\} \in E} w_{ij} (x_i - x_j)^2$$

Step 3: Express using the Laplacian

$$= \frac{1}{4} x^T L_G x$$

Conclusion: MAX-CUT $\iff \max_{x \in \{-1,1\}^n} \frac{1}{4} x^T L_G x$
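The three steps can be checked on the triangle $K_3$ by brute force; a small sketch (the helper `laplacian` is ours):

```python
import numpy as np
from itertools import product

def laplacian(n, edges):
    """Build the weighted graph Laplacian L_G from a dict {(i, j): w_ij}."""
    L = np.zeros((n, n))
    for (i, j), w in edges.items():
        L[i, i] += w; L[j, j] += w
        L[i, j] -= w; L[j, i] -= w
    return L

edges = {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0}   # triangle K3, unit weights
L = laplacian(3, edges)

best = 0.0
for x in product([-1.0, 1.0], repeat=3):
    x = np.array(x)
    cut = x @ L @ x / 4                            # (1/4) x^T L_G x
    direct = sum(w * (1 - x[i] * x[j]) / 2 for (i, j), w in edges.items())
    assert np.isclose(cut, direct)                 # Steps 1-3 agree
    best = max(best, cut)

assert best == 2.0                                 # MAX-CUT(K3) = 2
```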

SDP Relaxation of MAX-CUT

Matrix Lifting Technique

Given feasible $x \in \{-1,1\}^n$, define $X := x x^T \in \mathbb{S}^n$

Property 1

$X \succeq 0$

(PSD as a Gram matrix)

Property 2

$\operatorname{rank}(X) = 1$

(Rank-one structure)

Property 3

$X_{ii} = 1$

(Since $x_i^2 = 1$)

Key observation: $x^T L_G x = \sum_{i,j} (L_G)_{ij} x_i x_j = \sum_{i,j} (L_G)_{ij} X_{ij} = \operatorname{tr}(L_G X)$

Semidefinite Relaxation

Drop the non-convex rank-one constraint:

$$\max \ \tfrac{1}{4} \operatorname{tr}(L_G X) \quad \text{subject to} \quad X \succeq 0, \quad X_{ii} = 1, \ i = 1, \ldots, n$$

Alternative formulation, using $\operatorname{tr}(L_G X) = \sum_{\{i,j\} \in E} w_{ij} (X_{ii} + X_{jj} - 2 X_{ij})$:

$$\max \ \tfrac{1}{2} \sum_{\{i,j\} \in E} w_{ij} (1 - X_{ij}) \quad \text{subject to} \quad X \succeq 0, \quad X_{ii} = 1 \ \forall i$$

Why It's a Relaxation

  • Feasibility: every $x \in \{-1,1\}^n$ gives a feasible $X = x x^T$
  • Objective: same value, $\operatorname{tr}(L_G X) = x^T L_G x$
  • Relaxation: larger feasible set (rank constraint dropped)
  • Result: SDP optimum $\ge$ true MAX-CUT value
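A minimal numerical sketch of the lifting argument, on the 4-cycle with unit weights (the Laplacian below is hard-coded for $C_4$):

```python
import numpy as np

x = np.array([1.0, -1.0, 1.0, -1.0])   # a sign vector / cut
X = np.outer(x, x)                     # lifted matrix X = x x^T

L = np.array([[ 2., -1.,  0., -1.],
              [-1.,  2., -1.,  0.],
              [ 0., -1.,  2., -1.],
              [-1.,  0., -1.,  2.]])   # Laplacian of the 4-cycle

assert np.allclose(np.diag(X), 1)                 # X_ii = 1
assert np.linalg.eigvalsh(X).min() >= -1e-12      # X >= 0 (rank one)
assert np.isclose(np.trace(L @ X), x @ L @ x)     # same objective value
assert np.isclose(np.trace(L @ X) / 4, 4.0)       # this cut severs all 4 edges
```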

Geometric Interpretation

Cut Polytope: $\operatorname{conv}\{x x^T : x \in \{-1,1\}^n\}$

Elliptope: $\mathcal{E}_n = \{X \succeq 0 : X_{ii} = 1, \ i = 1, \ldots, n\}$

We have: Cut Polytope $\subseteq$ Elliptope

SDP optimizes over the larger elliptope

Fundamental Question: How close is the SDP optimum to the true MAX-CUT value?

Quick Check: Understanding SDP Relaxations

Question: Why is the SDP relaxation always an upper bound on MAX-CUT?

A. Because SDP solvers use approximation algorithms
B. Because the Laplacian matrix is positive semidefinite
C. Because we dropped the rank-1 constraint, enlarging the feasible set
D. Because Goemans-Williamson proved it in 1995
✓ Correct!
The SDP feasible region (elliptope) contains all rank-1 matrices xxT from the original problem, but also includes higher-rank PSD matrices. Since we're maximizing, a larger feasible set can only give an objective value ≥ the original optimum. This is the fundamental principle of convex relaxation.

Key Takeaway

Every convex relaxation of a discrete optimization problem provides an upper bound (for maximization) or lower bound (for minimization) on the optimal value. The quality of this bound depends on how tight the relaxation is.

Goemans-Williamson Algorithm

Main Approximation Result (Goemans-Williamson, 1995)

$$\alpha_{\mathrm{GW}} \cdot p_{\mathrm{SDP}}^* \ \le \ v_{\mathrm{MAX\text{-}CUT}} \ \le \ p_{\mathrm{SDP}}^*$$

where

$$\alpha_{\mathrm{GW}} = \min_{0 < \theta \le \pi} \frac{\theta/\pi}{(1 - \cos\theta)/2} \approx 0.87856$$

The minimum is achieved at $\theta^* \approx 2.331$ radians (equivalently $t = \cos\theta^* \approx -0.689$)

This is the best possible for polynomial-time algorithms assuming the Unique Games Conjecture (Khot et al., 2007)!

Randomized Hyperplane Rounding Algorithm

Input: optimal SDP solution $X^*$ with a factorization $X^* = V V^T$ (e.g., Cholesky)

where $V \in \mathbb{R}^{n \times r}$ (with $r = \operatorname{rank}(X^*)$) has rows $v_1, \ldots, v_n$ that are unit vectors: $\|v_i\|_2 = 1$

Algorithm Steps:
  1. Draw a random Gaussian vector $g \sim \mathcal{N}(0, I_r)$
  2. For each $i = 1, \ldots, n$: set $x_i := \operatorname{sign}(\langle v_i, g \rangle)$, with $\operatorname{sign}(0) := +1$
  3. Define the cut: $S = \{i : x_i = +1\}$, $S^c = \{i : x_i = -1\}$
  4. Output the cut $(S, S^c)$ with value $\frac{1}{4} x^T L_G x$
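The steps above can be sketched in numpy (the function name `gw_round` and the eigendecomposition-based factorization are our choices; any factorization $X = VV^T$ works). For $K_3$, the off-diagonal value $-1/2$ below is the known SDP optimum, with the three unit vectors at $120°$:

```python
import numpy as np

def gw_round(X, L, trials=200, seed=0):
    """Hyperplane rounding of a feasible SDP solution X (X >= 0, diag X = 1)."""
    rng = np.random.default_rng(seed)
    lam, Q = np.linalg.eigh(X)
    V = Q * np.sqrt(np.clip(lam, 0, None))    # rows v_i with X = V V^T
    best_val = -np.inf
    for _ in range(trials):
        g = rng.standard_normal(V.shape[1])   # random hyperplane normal
        x = np.where(V @ g >= 0, 1.0, -1.0)   # x_i = sign(<v_i, g>)
        best_val = max(best_val, x @ L @ x / 4)
    return best_val

# Triangle K3: SDP optimum has X_ij = -1/2 off-diagonal (vectors at 120 degrees)
X = 1.5 * np.eye(3) - 0.5 * np.ones((3, 3))
L = 3 * np.eye(3) - np.ones((3, 3))           # Laplacian of K3
assert gw_round(X, L) == 2.0                  # rounding recovers the optimal cut
```

Any hyperplane splits three coplanar vectors spaced $120°$ apart into a 2-vs-1 partition (almost surely), so every trial already returns the optimal cut of value 2.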

Geometric Interpretation

The algorithm cuts the unit vectors $\{v_i\}_{i=1}^n \subset \mathbb{R}^r$ by a random hyperplane through the origin with normal direction $g$. Vertices on the same side of the hyperplane are placed in the same partition.

Hyperplane: $\{u \in \mathbb{R}^r : \langle g, u \rangle = 0\}$

Analysis of Goemans-Williamson

Key Lemma: Expected Correlation

For the rounding $x_i = \operatorname{sign}(\langle v_i, g \rangle)$ with $g \sim \mathcal{N}(0, I_r)$:

$$\mathbb{E}[x_i x_j] = 1 - \frac{2}{\pi} \arccos(\langle v_i, v_j \rangle) = 1 - \frac{2\theta_{ij}}{\pi}$$

where $\theta_{ij} = \arccos(\langle v_i, v_j \rangle) \in [0, \pi]$ is the angle between $v_i$ and $v_j$

Proof Sketch

Let $\theta = \arccos(\langle v_i, v_j \rangle)$ be the angle between $v_i$ and $v_j$.

Key insight: $x_i x_j = -1$ (different signs) occurs exactly when the hyperplane separates $v_i$ from $v_j$, i.e., when the projection of $g$ onto the plane spanned by $v_i, v_j$ falls in one of two wedges of angular width $\theta$.

By rotational symmetry of the Gaussian:

$$\mathbb{P}\left[\operatorname{sign}(\langle v_i, g \rangle) \ne \operatorname{sign}(\langle v_j, g \rangle)\right] = \frac{\theta}{\pi}$$

Therefore:

$$\mathbb{E}[x_i x_j] = 1 \cdot (1 - \theta/\pi) + (-1) \cdot \theta/\pi = 1 - 2\theta/\pi$$
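The lemma is easy to confirm by Monte Carlo; a sketch for $\theta = \pi/3$ with two unit vectors in the plane (sample size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
theta = np.pi / 3
vi = np.array([1.0, 0.0])
vj = np.array([np.cos(theta), np.sin(theta)])

g = rng.standard_normal((200_000, 2))                # 200k random hyperplanes
disagree = ((g @ vi >= 0) != (g @ vj >= 0)).mean()   # fraction separated

assert abs(disagree - theta / np.pi) < 0.01          # P[signs differ] = theta/pi
assert abs((1 - 2 * disagree) - (1 - 2 * theta / np.pi)) < 0.02   # E[x_i x_j]
```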

Expected Cut Value

Using the identity $\operatorname{val}(S, S^c) = \frac{1}{4} x^T L_G x$:

$$\mathbb{E}[\operatorname{val}(S, S^c)] = \frac{1}{2} \sum_{\{i,j\} \in E} w_{ij} \left(1 - \mathbb{E}[x_i x_j]\right)
= \sum_{\{i,j\} \in E} w_{ij} \, \frac{\theta_{ij}}{\pi}$$

Comparison to SDP Value

The SDP optimal value is:

$$p_{\mathrm{SDP}}^* = \frac{1}{2} \sum_{\{i,j\} \in E} w_{ij} (1 - X_{ij}^*)$$

Since $X_{ij}^* = \langle v_i, v_j \rangle = \cos\theta_{ij}$:

$$p_{\mathrm{SDP}}^* = \frac{1}{2} \sum_{\{i,j\} \in E} w_{ij} (1 - \cos\theta_{ij})$$

Critical Approximation Bound

Define $\alpha(\theta) := \dfrac{\theta/\pi}{(1 - \cos\theta)/2}$ for $\theta \in (0, \pi]$

Then: $\dfrac{\theta}{\pi} \ge \alpha_{\mathrm{GW}} \cdot \dfrac{1 - \cos\theta}{2}$, where $\alpha_{\mathrm{GW}} = \min_{\theta \in (0,\pi]} \alpha(\theta) \approx 0.87856$

Therefore:

$$\mathbb{E}[\operatorname{val}(S, S^c)] = \sum_{\{i,j\} \in E} w_{ij} \frac{\theta_{ij}}{\pi} \ \ge \ \alpha_{\mathrm{GW}} \cdot \frac{1}{2} \sum_{\{i,j\} \in E} w_{ij} (1 - \cos\theta_{ij}) = \alpha_{\mathrm{GW}} \, p_{\mathrm{SDP}}^*$$
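The constant $\alpha_{\mathrm{GW}}$ can be reproduced by a simple grid search over $\theta$ (grid resolution is an arbitrary choice):

```python
import numpy as np

theta = np.linspace(1e-6, np.pi, 200_001)
alpha = (theta / np.pi) / ((1 - np.cos(theta)) / 2)   # the ratio alpha(theta)

i = alpha.argmin()
assert abs(alpha[i] - 0.87856) < 1e-4          # alpha_GW
assert abs(theta[i] - 2.331) < 1e-2            # minimizer theta* (radians)
assert abs(np.cos(theta[i]) + 0.689) < 1e-2    # t = cos(theta*) = -0.689
```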

Recap: Part I – Matrix Norms & MAX-CUT

What We've Covered So Far

📊 Matrix Norms via SDP

  • Nuclear norm $\|\cdot\|_1$ and spectral norm $\|\cdot\|_\infty$
  • Dual relationship via trace inner product
  • SDP formulations using Schur complements
  • Applications to matrix completion

✂️ Maximum Cut Problem

  • NP-hard combinatorial optimization
  • Quadratic formulation via graph Laplacian
  • SDP relaxation through matrix lifting
  • 0.878-approximation via randomized rounding

🔑 Key Techniques

  • Convex Relaxation: Drop non-convex constraints (e.g., rank-1) to get tractable problems
  • Schur Complement: Essential tool for converting matrix inequalities to SDP form
  • Randomized Rounding: Extract discrete solutions from continuous SDP solutions
  • Probabilistic Analysis: Prove approximation guarantees in expectation

Coming Up Next: Shannon Capacity

We'll explore how SDP techniques apply to information theory and graph coloring problems through the Lovász theta function.

Shannon Capacity and Communication

Noisy Communication Model

Setting: Transmit symbols from alphabet {1,,n} over a noisy channel

Confusability Graph G=(V,E):

  • Vertices V={1,2,,n} represent symbols
  • Edge {i,j}E means symbols i,j may be confused by receiver
  • Safe transmission requires independent set (no edges between used symbols)
$$\alpha(G) = \max\{|S| : S \subseteq V, \ \{i,j\} \notin E \text{ for all } i, j \in S, \ i \ne j\}$$

$\alpha(G)$ is the independence number (the maximum size of an independent set)

Block Coding Strategy

Transmit blocks of length $k$: $(i_1, \ldots, i_k) \in V^k$

Strong Graph Product

$G^{\boxtimes k}$ has vertex set $V^k$ and edges:

$$\{(i_1, \ldots, i_k), (j_1, \ldots, j_k)\} \in E(G^{\boxtimes k})$$

if for every coordinate $\ell$, either $i_\ell = j_\ell$ or $\{i_\ell, j_\ell\} \in E(G)$

Shannon Capacity (Shannon, 1956)

$$\Theta(G) = \lim_{k \to \infty} \left( \alpha(G^{\boxtimes k}) \right)^{1/k}$$

(The limit exists and equals $\sup_k \alpha(G^{\boxtimes k})^{1/k}$ by Fekete's lemma, since $\{\log \alpha(G^{\boxtimes k})\}$ is superadditive)

Information Rate: $\log_2 \Theta(G)$ bits per symbol

Alternative form:

$$C_0(G) = \lim_{k \to \infty} \frac{1}{k} \log_2 \alpha(G^{\boxtimes k})$$

Famous Example: Pentagon C5

  • $\alpha(C_5) = 2$ (can use 2 non-adjacent symbols)
  • $\alpha(C_5^{\boxtimes 2}) = 5$ (block coding achieves more!)
  • Lovász proved: $\Theta(C_5) = \sqrt{5} \approx 2.236$
  • Open problem: what is $\Theta(C_7)$? (Known: $3 \le \Theta(C_7) \le 3.3177\ldots$)
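Shannon's classical construction behind $\alpha(C_5^{\boxtimes 2}) = 5$ can be checked by brute force; a short sketch (helper names are ours):

```python
from itertools import combinations

cycle = {(i, (i + 1) % 5) for i in range(5)}            # edges of C5
adj = lambda a, b: (a, b) in cycle or (b, a) in cycle

# Independence number of C5 itself: alpha(C5) = 2
alpha = max(len(S) for r in range(6) for S in combinations(range(5), r)
            if not any(adj(a, b) for a, b in combinations(S, 2)))
assert alpha == 2

# Strong-product adjacency: distinct blocks confusable in every coordinate
def strong_adj(u, v):
    return u != v and all(a == b or adj(a, b) for a, b in zip(u, v))

# Shannon's independent set of size 5 in C5 x C5: the pairs (i, 2i mod 5)
S = [(i, 2 * i % 5) for i in range(5)]
assert not any(strong_adj(u, v) for u, v in combinations(S, 2))
```

This confirms $\alpha(C_5^{\boxtimes 2}) \ge 5$, hence $\Theta(C_5) \ge \sqrt{5}$.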

The Lovász Theta Function

Orthonormal Representation (Lovász, 1979)

A collection $(c, u_1, \ldots, u_n)$ of unit vectors in $\mathbb{R}^m$ is an orthonormal representation of $G$ if:

$$\langle u_i, u_j \rangle = 0 \quad \text{whenever } \{i,j\} \notin E \text{ and } i \ne j$$

Non-adjacent vertices must have orthogonal unit vectors!

Note: all vectors are unit: $\|c\|_2 = \|u_i\|_2 = 1$

Lovász $\vartheta$ Function

$$\vartheta(G) := \min_{(c, u_1, \ldots, u_n)} \ \max_{1 \le i \le n} \ \frac{1}{\langle c, u_i \rangle^2}$$

where the minimum is over all orthonormal representations of $G$.

Alternative formulation (Lovász): $\vartheta(G) = \max \sum_{i=1}^n \langle c, u_i \rangle^2$, the maximum over unit vectors $c$ and orthonormal representations $(u_1, \ldots, u_n)$ of the complement $\bar{G}$

Lower Bound

$$\alpha(G) \le \vartheta(G)$$

An independent set gives pairwise orthogonal vectors

Upper Bound

$$\vartheta(G) \le \chi(\bar{G})$$

$\chi(\bar{G})$ is the chromatic number of the complement

Computability

Polynomial Time

Via semidefinite programming!

Fundamental Inequalities (Lovász, 1979)

$$\alpha(G) \ \le \ \Theta(G) \ \le \ \vartheta(G) \ \le \ \chi(\bar{G})$$

Furthermore, $\vartheta$ is multiplicative under the strong product: $\vartheta(G \boxtimes H) = \vartheta(G)\,\vartheta(H)$

This implies $\vartheta(G^{\boxtimes k}) = \vartheta(G)^k$, so $\Theta(G) = \lim_k \alpha(G^{\boxtimes k})^{1/k} \le \lim_k \vartheta(G^{\boxtimes k})^{1/k} = \vartheta(G)$

Since computing α(G) is NP-hard, ϑ(G) provides a polynomial-time computable upper bound!

SDP Formulation of ϑ(G)

Primal SDP via Gram Matrix

Fix a unit vector $c$ and an orthonormal representation $(u_1, \ldots, u_n)$ of the complement $\bar{G}$, and let $X \in \mathbb{S}^{n+1}$ be the Gram matrix of the $n+1$ vectors $x_0 = c$ and $x_i = \langle c, u_i \rangle\, u_i$:

$$X_{ij} = \langle x_i, x_j \rangle, \qquad i, j = 0, 1, \ldots, n$$

Primal SDP: $\quad \vartheta(G) = \max \ \sum_{i=1}^n X_{0i} \quad \text{subject to} \quad X_{00} = 1, \quad X_{ii} = X_{0i} \ (i = 1, \ldots, n), \quad X_{ij} = 0 \ \forall \{i,j\} \in E, \quad X \succeq 0$

Note: the objective equals $\sum_{i=1}^n \langle c, u_i \rangle^2$, which is exactly Lovász's max-form of $\vartheta(G)$

Constraint Interpretation

  • $X_{00} = 1$: the handle $c$ is a unit vector
  • $X_{ii} = X_{0i}$: the $i$-th Gram vector is $\langle c, u_i \rangle u_i$, of squared length $\langle c, u_i \rangle^2$
  • $X_{ij} = 0$ for edges $\{i,j\} \in E$: adjacent vertices carry orthogonal $u_i, u_j$ (an orthonormal representation of $\bar{G}$)
  • $X \succeq 0$: valid Gram matrix structure

Dual SDP (Simplified)

$$\vartheta(G) = \min \ \lambda \quad \text{subject to} \quad \lambda I - J - Z \succeq 0, \quad Z \in \mathbb{S}^n, \ Z_{ij} = 0 \text{ unless } \{i,j\} \in E$$

where $J$ is the all-ones matrix and the slack matrix $Z$ is supported on the edges of $G$; equivalently, $\vartheta(G) = \min \lambda_{\max}(A)$ over symmetric $A$ with $A_{ij} = 1$ whenever $i = j$ or $\{i,j\} \notin E$ (Lovász, 1979)

Polynomial-Time Solvability

Both primal and dual satisfy Slater's condition, ensuring:

  • Optimal values are equal (strong duality)
  • Optimal solutions exist
  • Solvable in polynomial time via interior-point methods
  • Complexity: $O(n^{4.5})$ or better with modern SDP solvers

Worked Example: Pentagon C5

Computing $\vartheta(C_5) = \sqrt{5}$

The pentagon $C_5$ has 5 vertices arranged in a cycle. By symmetry, an optimal dual solution can be sought among circulant certificates

$$A(t) = J + t\, A_{C_5}$$

i.e., entries $1$ on the diagonal and on the non-edges, and $1 + t$ on the five cycle edges. (For the pentagon, the complement $\bar{C_5}$ is again a 5-cycle, on the distance-2 pairs.)

Dual Certificate

Since $J$ and $A_{C_5}$ are circulants, they diagonalize in the same Fourier basis. The eigenvalues of $A(t)$ are:

  • All-ones vector: $5 + 2t$
  • $2t\cos(2\pi/5) = (\varphi - 1)\, t$ (multiplicity 2)
  • $2t\cos(4\pi/5) = -\varphi\, t$ (multiplicity 2)

where $\varphi = \frac{1+\sqrt{5}}{2}$ is the golden ratio

Optimal Value

Taking $t < 0$ and balancing the two competing eigenvalues, $5 + 2t = -\varphi t$, gives

$$t^* = -\frac{5 - \sqrt{5}}{2}, \qquad \lambda_{\max}(A(t^*)) = \sqrt{5}$$

hence $\vartheta(C_5) \le \sqrt{5}$

Verification via Primal

The primal optimum is achieved by the Lovász umbrella: unit vectors in $\mathbb{R}^3$ with handle $c$ and five rib vectors $u_1, \ldots, u_5$, opened just enough that the required pairs are orthogonal. By symmetry, all inner products satisfy

$$\langle c, u_i \rangle^2 = \frac{1}{\sqrt{5}}$$

This gives $\vartheta(C_5) \ge \sum_{i=1}^5 \langle c, u_i \rangle^2 = 5 \cdot \frac{1}{\sqrt{5}} = \sqrt{5}$, so $\vartheta(C_5) = \sqrt{5}$
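The dual certificate can be checked numerically; a sketch assuming the circulant family $J + t\,A_{C_5}$ (for the pentagon, an equivalent family supported on the complement edges behaves the same way, since $\bar{C_5} \cong C_5$):

```python
import numpy as np

J = np.ones((5, 5))
A = np.zeros((5, 5))
for i in range(5):                       # adjacency matrix of the 5-cycle
    A[i, (i + 1) % 5] = A[(i + 1) % 5, i] = 1

ts = np.linspace(-2.0, 0.0, 20_001)      # grid over the free weight t
lam_max = np.array([np.linalg.eigvalsh(J + t * A)[-1] for t in ts])

assert abs(lam_max.min() - np.sqrt(5)) < 1e-3                    # theta(C5) = sqrt(5)
assert abs(ts[lam_max.argmin()] + (5 - np.sqrt(5)) / 2) < 1e-2   # t* = -(5 - sqrt(5))/2
```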

Final Result

Since we also know $\alpha(C_5^{\boxtimes 2}) = 5$, we have $\Theta(C_5) \ge \sqrt{\alpha(C_5^{\boxtimes 2})} = \sqrt{5}$, and therefore

$$\Theta(C_5) = \vartheta(C_5) = \sqrt{5}$$

This is one of the few cases where Shannon capacity is exactly computable!

Summary: SDP in Combinatorial Optimization

Key Paradigms We've Learned

🔧 Matrix Lifting

  • Replace variables x with matrices X=xxT
  • Drop non-convex rank constraints
  • Preserve essential structure via linear constraints
  • Enables convex relaxation of discrete problems

🎯 Randomized Rounding

  • Extract discrete solutions from SDP solutions
  • Hyperplane cuts in geometric representations
  • Performance guarantees via probabilistic analysis
  • Often achieves best-known approximation ratios

Matrix Norms

  • Spectral norm: $\|M\|_\infty$ via Schur complement
  • Nuclear norm: $\|M\|_1$ minimization
  • Duality: $\|\cdot\|_\infty$ and $\|\cdot\|_1$ are dual norms
  • Applications: matrix completion, robust PCA

Maximum Cut

  • Graph Laplacian formulation
  • SDP relaxation via matrix lifting
  • GW algorithm: 0.878-approximation
  • Best possible under UGC

Shannon Capacity

  • Information-theoretic channel capacity
  • Lovász $\vartheta$ function bounds $\Theta(G)$
  • Polynomial-time via SDP
  • Exact for $C_5$: $\Theta(C_5) = \sqrt{5}$

🚀 Broader Impact & Future Directions

Semidefinite programming has revolutionized approximation algorithms and beyond:

  • Machine Learning: Matrix completion, robust PCA, metric learning, kernel methods
  • Control Theory: Linear matrix inequalities (LMIs), robust control, $H_\infty$ control
  • Quantum Information: Entanglement detection, quantum games, separability
  • Combinatorial Optimization: Graph coloring, MAX-SAT, constraint satisfaction
  • Sum-of-Squares: Polynomial optimization, moment problems, hierarchies

Thank You!

Questions and Discussion