*Here is the second guest post from ICALP 2014 kindly written by Clément Canonne. Enjoy it! (This post is also available in PDF. Any typo or typesetting issue with the HTML version is my responsibility.)*

#### Fast Algorithms for Constructing Maximum Entropy Summary Trees (Howard Karloff)

This talk [1] revolved around a question. A simple question, really. *“How to draw a rooted tree?”* Not *any* tree, though: a big one. A huge, gigantic n-node tree T, with n ≫ ⌊2^{22.891112}⌋ [2], on a tiny sheet of paper (or a screen, for that matter). Most of the time, the approach is either to

- draw the whole thing, get binoculars and hope something can be seen at all;
- go for interaction, and draw/zoom on parts of the graph on-the-fly.

The approach taken here is neither: instead, compute and display a small, k-node tree T^{′}, the “best summary” of T. (At this point – a demo! From the Mathematical Genealogy Project: the tree being made of the academic descendants of an obscure mathematician, someone named Gauss. That’s a hell of a tree.)

The key of their approach is *not* to draw only k carefully chosen nodes of T and dismiss the others, but instead to allow T^{′} to combine subsets of T’s nodes into a big meta-node named “others” (†).

Sure. But what is a “good summary” anyway? Quite naturally, the authors went for entropy – a tree is *good* if, assigning weights to nodes in the straightforward fashion (for v ∈ T^{′}, ω_{v} ∝ the total weight of original nodes of T in the subtree of T^{′} rooted at v, normalized so that ∑_{v}ω_{v} = 1), the entropy

H(T^{′}) = −∑_{v ∈ T′} ω_{v} log ω_{v}

is big. That is, if T^{′} is as balanced as possible – or, phrased differently, remains very *informative*.
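To make the objective concrete, here is a small Python sketch of the entropy computation (my own illustration; `summary_entropy` and its interface are made up, not from the paper):

```python
from math import log2

def summary_entropy(weights):
    """Shannon entropy of the normalized node weights of a candidate summary.

    `weights` are the total original weights collected under each of the k
    nodes of the summary tree T'; higher entropy = more balanced = a more
    informative summary.
    """
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * log2(p) for p in probs)

# A balanced 4-node summary beats a badly skewed one:
print(summary_entropy([25, 25, 25, 25]))  # 2.0
print(summary_entropy([97, 1, 1, 1]))     # ≈ 0.242
```

The optimal k-node summary tree is then the one maximizing this quantity over all valid ways of merging nodes into “others”.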

So, what’s the problem? The problem is exactly (†), when it comes to computing such an optimal k-node summary tree. Indeed, if we allow an algorithm to combine an arbitrary subset of a node v’s children into a meta-node others_{v} (and we do!), there is the slight problem of having exponentially many possibilities – and that could turn out pretty bad. An intuitive idea [3] would be to sort v’s children by non-decreasing weights and combine together a *prefix* (the first, say, ℓ of them) – they shouldn’t be too important anyway. Sadly, it does not work – there are counterexamples, even for n = 7 and k = 4.

In a previous work [4], Kenneth Shirley and Howard Karloff managed to give a *pseudo*-polynomial algorithm (polynomial in the total weight), and a polynomial-time (additive) approximation one. This work goes in for the kill, removing the “pseudo” to give

- an O(k^{2}n + n log n)-time algorithm for computing an optimal k-node summary tree;
- a (better) O(n + poly(k, 1∕ε))-time approximation algorithm.

How? Plain prefixes may fail, but *extended prefixes* do the job! Instead of (after sorting a node v’s children) merging the nodes {1,…,ℓ}, it is sufficient to merge the sets {1,…,ℓ,ℓ^{′}} for some ℓ^{′} ≥ ℓ + 1 (yes, it is surprising to me too).

And *voilà*! Polynomial time.
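To see how much this shrinks the search space, here is a quick sketch (mine, not the authors’ code; names are made up) enumerating the candidate “others” groups for a single node under the extended-prefix rule:

```python
def extended_prefixes(children):
    """Yield the candidate 'others' groups for one node's children (given by
    their weights): every prefix of the weight-sorted children, optionally
    extended by one single later child -- O(c^2) candidates for c children,
    instead of the 2^c arbitrary subsets.
    """
    c = sorted(children)  # non-decreasing weight order
    n = len(c)
    for l in range(n + 1):
        yield c[:l]                      # plain prefix {1, ..., l}
        for extra in range(l + 1, n):    # plus one strictly later child l'
            yield c[:l] + [c[extra]]

print(list(extended_prefixes([5, 1, 3])))
# [[], [3], [5], [1], [1, 5], [1, 3], [1, 3, 5]]
```

Only seven candidates for three children – and, crucially, a quadratic rather than exponential number in general, which is what makes the dynamic program run in polynomial time.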

#### Testing Equivalence of Polynomials under Shifts (Rafael Mendes de Oliveira)

And now, some testing [5]! Rafael has a device: a blackbox computing an n-variate polynomial P ∈ F[X_{1},…,X_{n}] (for some fixed field F). And another one – computing Q ∈ F[X_{1},…,X_{n}]. On inputs x ∈ F^{n}, these two devices allow him to compute P(x), Q(x) in constant time.

Rafael also has two coauthors, Zeev Dvir and Amir Shpilka, and a question:

**Question** (Shift Equivalence Testing (SET)). Given a bound on their degree, decide if P and Q are *morally* the same polynomial – that is, if there is a shift vector a such that P(⋅ + a) = Q(⋅) (and if so, please, find one).
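To pin down the definition, here is a toy brute-force version of SET over a small prime field – exponential in n and nothing like the actual algorithm, purely illustrative, with made-up names. Polynomials are maps from exponent tuples to coefficients mod p:

```python
from itertools import product

P_MOD = 5  # work over the field F_5

def evaluate(poly, x, p=P_MOD):
    """Evaluate a polynomial given as {exponent-tuple: coefficient} at x."""
    total = 0
    for exps, coef in poly.items():
        term = coef
        for xi, e in zip(x, exps):
            term = term * pow(xi, e, p) % p
        total = (total + term) % p
    return total

def find_shift(P, Q, n, p=P_MOD):
    """Return a with P(x + a) == Q(x) for every x in F_p^n, or None.

    Brute force over all p^n shifts and all p^n evaluation points, so this
    checks equality as functions -- fine for this low-degree toy example.
    """
    points = list(product(range(p), repeat=n))
    for a in points:
        if all(evaluate(P, tuple((xi + ai) % p for xi, ai in zip(x, a)), p)
               == evaluate(Q, x, p) for x in points):
            return a
    return None

# P(X, Y) = X^2 + Y and Q(X, Y) = P(X + 1, Y + 3) = X^2 + 2X + Y + 4 (mod 5)
P = {(2, 0): 1, (0, 1): 1}
Q = {(2, 0): 1, (1, 0): 2, (0, 1): 1, (0, 0): 4}
print(find_shift(P, Q, 2))  # (1, 3)
```

The whole point of the paper is, of course, doing this without the doubly exponential enumeration, with blackbox access only.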

So SET must be difficult, since it not only generalizes polynomial identity testing (PIT – take Q ≡ 0), but also asks for an explicit witness – this vector a. After giving some background on what is known about related questions (is P “shift-equivalent” to a polynomial that can be represented with very few non-zero coefficients, for instance), Rafael gave the main result of their paper:

**Theorem.** There is an efficient (randomized) algorithm for SET. Furthermore, the only randomized part comes from solving instances of PIT.

Roughly speaking, the idea of the proof is to find such an a “one step at a time”, by successively coming up with candidates a_{d}, a_{d−1}, …, a_{1}. How? Find a_{d} that matches the degree-d part of P(⋅ + a) and Q(⋅) – this is just a linear system to solve, along with a call to a randomized subroutine for PIT. Then find a_{d−1} that matches both the degree-d and degree-(d − 1) parts of P(⋅ + a) and Q(⋅) – this is no longer linear, alas. But, and that’s the trick, one can plug the known value of a_{d} into the quadratic system to make it collapse to a linear system, which has the same solutions as the quadratic one!

And so on, and so forth: after computing a_{k+1}, one just has to solve a linear system to get a_{k}, using the nice properties of a_{k+1} (simplifications “trickle down”), before calling PIT to make sure the result is correct so far. And at the end of the day, either a_{1} is a good shift-vector, or there is none – and a final call to the PIT subroutine lets us know which of the two cases holds.
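In the univariate case the “one layer at a time” scheme can be made completely explicit: the top coefficient is untouched by a shift, the next one pins down a via a single linear equation, and the remaining layers only need checking. A toy sketch over the rationals (my illustration, not the paper’s algorithm):

```python
from fractions import Fraction

def shift_coeffs(p, a):
    """Coefficients of P(x + a), with p = [p_0, p_1, ..., p_d] (Fractions)."""
    result = []
    for c in reversed(p):  # Horner: result(x) * (x + a) + c
        new = [Fraction(0)] * (len(result) + 1)
        for i, r in enumerate(result):
            new[i + 1] += r
            new[i] += r * a
        new[0] += c
        result = new
    return result

def find_univariate_shift(p, q):
    """Solve P(x + a) = Q(x) in one variable: the x^d coefficient must agree
    as-is, the x^(d-1) coefficient of P(x + a) is p[d-1] + d*a*p[d], which is
    linear in a, and the lower layers are then mere verification."""
    d = len(p) - 1
    if p[d] != q[d]:
        return None
    a = (q[d - 1] - p[d - 1]) / (d * p[d])
    return a if shift_coeffs(p, a) == q else None

# P(x) = x^3, Q(x) = (x + 2)^3 = x^3 + 6x^2 + 12x + 8
p = [Fraction(c) for c in (0, 0, 0, 1)]
q = [Fraction(c) for c in (8, 12, 6, 1)]
print(find_univariate_shift(p, q))  # 2
```

In the multivariate setting the top-layer equation becomes a genuine linear *system*, the lower layers do not collapse for free, and one needs PIT to verify each step blackbox-style – that is where all the actual work of the paper lies.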

Hop.

- [1] *Fast Algorithms for Constructing Maximum Entropy Summary Trees*, R. Cole and H. Karloff. ICALP, 2014.
- [2] This is a prime number. Check it out.
- [3] Rule of thumb: when I write “intuitive”, it is a trap.
- [4] *Maximum Entropy Summary Trees*, K. Shirley and H. Karloff. Computer Graphics Forum, 2013.
- [5] *Testing Equivalence of Polynomials under Shifts*, Z. Dvir, R. M. de Oliveira, and A. Shpilka. ICALP, 2014.
