Process Algebra Diary: July 2014

Sunday, July 13, 2014

Day 4 and Recap of ICALP 2014 (guest post by Andrew Winslow)

Here is the last guest post by Andrew Winslow from ICALP 2014. Enjoy it! Thanks to Andrew for his conference reports.

This [Friday, 11 July 2014] is the last day of ICALP!

George Mertzios, Sotiris Nikoletseas, Christoforos Raptopoulos and Paul Spirakis. Determining Majority in Networks with Local Interactions and very Small Local Memory

This talk focused on a problem in distributed computing on a graph with stateful nodes. Given a graph with nodes colored red and blue, a population protocol is a set of rules that define how the colors of a pair of endpoints of an edge can be transformed. For instance, a population protocol might include a rule to turn the colors of a pair of red nodes sharing to both be blue. One is also allowed to add additional colors not found in inputs, and the total number of colors that appear in the input and population protocol is the number of states of the protocol.
A population protocol for the majority problem transforms the colors of all (or all but a (1-ε) fraction) of the nodes into the color appearing most frequently in the input graph. In may be possible at any point during the execution of the protocol that more than one rule can be applied to the endpoints of more than one edge, and it is assumed that either the rules are chosen randomly (a randomized scheduler) or advesarially subject to some basic requirements, like that any rule that can be applied is applied after a finite amount of time (a fair scheduler). One can also ask about the amount of time (i.e. the number of rule applications) needed for a majority to be reached.

It was previously known that a 3-state protocol for majority exists in the case that both the graph is a clique and the ratio of the occurrence counts of most common color over least common color is ω(sqrt(n)log(n))[1]. In this work, Mertzios, Nikoletseas, Christoforos, and Spirakis constructed a four-state protocol that correct computes majority with probability 1 on any graph and with any difference between the number of occurrences of the two colors, assuming a fair scheduler. They also prove that the 3-state protocol described in [1] is strictly worse, failing to converge to the correct majority with high probability for an infinite family of graphs (graphs consisting of a clique at the end of a chain), even with a probabilistic scheduler.

[1] D. Angluin, J. Aspnes, and D. Eisenstat. A simple population protocol for fast robust approximate majority. Distributed Computing, 21(2): 87-102, 2008.

Karl Bringmann, Fabian Kuhn, Konstantinos Panagiotou, Ueli Peter and Henning Thomas. Internal DLA: Efficient Simulation of a Physical Growth Model.

Karl gave the talk. This work is about improving the speed of sampling the results of the following process, called internal diffuse limited aggregation or IDLA:

Place a particle at origin in the square lattice.
Take a random walk with this particle until a lattice location not containing another particle is found.
Place this particle at the empty location.

As one might expect, the resulting shape of the particle locations is usually roundish. The roundness can be captured by considering the ratio of the radii of the smallest containing and largest contained circles for the shape, centered at the origin. It is known that the expected difference in the ratios of these circles is O(log(n)), i.e. that the shape is usually quite round.
The IDLA process comes from statistical physics, and natural problems in this area related to understanding the distribution of shapes generated by this process. The naive approach to randomly sample the result of the IDLA process takes O(n^2) time, since the expected radius is Theta(n^0.5), a randomly walk from the origin takes Theta(n) expected time to reach an empty location (usually on the boundary), and n particles must be simulated.
Bringmann, Kuhn, Panagiotou, Peter, and Thomas improve this to O(nlog^2(n)) expected time and O(n^0.5*log(n)) expected space with high probability. The idea of the algorithm is to efficiently carry out the random walk by making large jumps, carrying out a significant portion of the walk in O(1) expected time. An upper bound on the number of jumps can then be obtained by computing the time expected to leave the region containing previous points (with radius O(sqrt(n)*log(n))).

Recap

This was my first time attending ICALP, and overall had a great experience. The conference was well-organized and the work presented was consistently very high quality and also had a great diversity to it; every session had a fairly distinct theme. One of the more unique aspects of the conference are the three topical tracks, and the mixing and interaction between "Track-A people" and "Track-B people" (and don't forget Track-C people...) led to some great discussions.

Friday, July 11, 2014

July 9, 2014 - July 10, 2014 ICALP (guest post by Clément Canonne)

Here is the second guest post from ICALP 2014 kindly written by Clément Canonne. Enjoy it! (This post is also available in PDF. Any typo or typesetting issue with the HTML version is my responsibility.)

Fast Algorithms for Constructing Maximum Entropy Summary Trees (Howard Karloff)

This talk [1] revolved around a question. A simple question, really. “How to draw a rooted tree?” Not any tree, though: a big one. A huge, gigantic n-node tree T, with n » ⌊2^22.891112⌋ [2], on a tiny sheet of paper (or a screen, for that matter). Most of the time, the approach is either to

draw the whole thing, get a binocular and hope something can be seen at all;
go for interaction, and draw/zoom on parts of the graph on-the-fly.

Here, none of this: T is T, magnificent n-node weighted tree; and we are given a parameter k ≥ n, to be though of as k « n; the task is to draw a k-node tree T^′, the “best summary” of T. (At this point – a demo! From the Mathematical Genealogy Project: the tree being made of the academic descendants of a obscure mathematician, someone named Gauss. That’s a hell of a tree.)
The key of their approach is not to draw only k carefully chosen nodes of T, and dismiss the others; but instead to allow T^′ to combine subsets of T’s nodes into a big, meta-node named “others” (†).
Sure. But what is a “good summary” anyway? Quite naturally, the authors went for entropy – a tree is good if, assigning weights to nodes in the straightforward fashion (for v ∈ T^′, ω_v ∝ the total weight of original nodes of T in the subtree of T^′ rooted at v), the entropy

is big. That is, if T^′ is as balanced as possible, or phrased differently remains very informative.

So, what’s the problem? The problem is exactly (†), when it comes to computing such an optimal k-node summary tree. Indeed, if we allow an algorithm to combine an arbitrary subset of a node v into a metanode others_v (and we do!), there is the slight problem of having exponentially many possibilities – and that could turn out pretty bad. An intuitive idea [3] would be to sort v’s children by non-decreasing weights, and combine together a prefix (the first, say, ℓ of them) – they shouldn’t be too important anyway. Sadly, it does not work – there are counter examples, even for n = 7 and k = 4.

In a previous work [4], Kenneth Shirley and Howard Karloff managed to give a pseudopolynomial algorithm (polynomial in the total weight), and a polynomial-time (additive) approximation one. This works came for the kill, removing the “pseudo” to give

a O(k²n + nlog n)-time algorithm for computing an optimal k-node summary tree;
a (better) O(n + poly(k,1∕ε))-time approximation algorithm

as well as a faster (albeit non-provably optimal) greedy algorithm. The crux of the approach? Prefixes don’t work, alright. But extended prefixes do! Instead of (after sorting a node v’s children) merging the nodes {1,…,ℓ}, it is sufficient to merge the {1,…,ℓ,ℓ^′} for some ℓ^′ ≥ ℓ + 1 (yes, it is surprising to me too).
And voilà! Polynomial time.

Testing Equivalence of Polynomials under Shifts (Rafael Mendes de Oliveira)

And now, some testing [5]! Rafael has a device: a blackbox computing a n-variate polynomial P ∈ F[X₁,…,X_n] (for some fixed field F). And another one – computing Q ∈ F[X₁,…,X_n]. On inputs x ∈ Fⁿ, these two devices allow him to compute P(x), Q(x) in constant time.

Rafael also has two coauthors, Zeev Dvir and Amir Shpilka, and a question:

Question (Shift Equivalence Testing (SET)). Given a bound on their degree, decide if P and Q are morally the same polynomial. That is, if there is a shift vector a such that P(⋅ + a) = Q(⋅) (and if so, please, find one.)

This is arguably a natural generalization of the infamous Polynomial Identity Testing (PIT) problem, where the goal is to decide whether P is the all-zero polynomial or not. PIT does not have any known efficient deterministic algorithm, but a very simple polynomial-time randomized one; and the task of improving this state of affairs has ramifications in many areas.
So SET must be difficult, since it not only generalizes the question, but also asks for an explicit witness – this vector a. After giving some background on what is known about related questions (is P “shift-equivalent” to a polynomial that can be represented with very few non-zero coefficients, for instance), Rafael gave the main result of their paper:

Theorem. There is an efficient (randomized) algorithm for SET. Furthermore, the only randomized part comes from solving instances of PIT.

In other words, in spite of what it seems, SET is not much harder than PIT; and any improvement on the use of randomness for PIT directly translates to the corresponding improvement for SET!
Roughly speaking, the idea of the proof is to try to find a “one step at a time”, by successively coming up with candidates a_d, a_d-1, …, a₁. How? Find a_d that matches the degree-d part of P(⋅ + a) and Q(⋅) – this is just a linear system to solve, along with a call to a randomized subroutine for PIT. Then find a_d-1 that matches both the degree-d and degree-(d - 1) parts of P(⋅ + a) and Q(⋅) – this no longer linear, alas. But, and that’s the trick, one can plug the known value of a_d in the quadratic system to make it collapse to a linear system, which has the same solutions as the quadratic one!
And so on, and so forth: after computing a_k+1, one just has to solve a linear system to get a_k, using the nice properties of a_k+1 (simplifications “trickles down”), before calling PIT to make sure the result is correct so far. And at the end of the day, either a₁ is a good shift-vector, or there is none – and a final call the PIT subroutine lets us know which of the two cases hold.
Hop.

[1] Fast Algorithms for Constructing Maximum Entropy Summary Trees, R. Cole and H. Karloff. ICALP, 2014.
[2] This is a prime number. Check it out.
[3] Rule of thumb: when I write “intuitive”, it is a trap.
[4] Maximum Entropy Summary Trees, K. Shirley and H. Karloff, Computer Graphics Forum. 2013.
[5] Testing Equivalence of Polynomials under Shifts, Z. Dvir, R.M de Oliveira, and A. Shpilka. ICALP, 2014.

Thursday, July 10, 2014

Day 3 of ICALP 2014 (guest post by Andrew Winslow)

Here is the third guest post by Andrew Winslow from ICALP 2014. Enjoy it!

Invited Talk: Sanjeev Arora

Sanjeev gave a survey of recent work by himself and others in working to develop provable bounds on the performance of many problems across unsupervised learning. The talk itself was entitled "Overcoming Intractability for Unsupervised Learning", where the "unsupervised" refers (loosely) to being given a pile of raw data and asked to extra some relevant statistical data, e.g. given a collection of newspaper articles, what topics do they cover. The foothold for these problems is assuming that the data comes from some (unknown) distribution, and many fo the problems relate to learning something about the distribution.

Almost all of the problems covered are NP-hard, and the first part of the talk was an explanation of how this is overcome. In Sanjeev's words, "NP-hardness is not the right way to think about this". He explained that the starting point is determining if the set of interesting/realistic/natural instances lie in a tractable subset by looking for realistic assumptions that lead to tractable instances.
As an example, he described the topic modelling problem, where one is given a corpus of text (e.g. all New York Times articles) and the goal is to describe each article as a linear combination of topics, where the topics are not given in advance (i.e. unsupervised learning). A geometric view of this problem is being given a set of points in a convex polytope (articles) and being asked to find the vertices of the polytope (topics). The problem is NP-hard in general, but can be made tractable by assuming that the data is separable: the vertices are included in the set of points given. Then a polynomial time algorithm have be obtained: for each point, test if it's contained on the convex hull, if not, it can be removed, and otherwise it must be a vertex.

He also spoke about making process on other machine learning techniques, including dictionary learning (given complex objects built from a small, simple dictionary, compute the dictionary) and approaches like deep learning (many-level neutral networks). Dictionary learning and deep learning are both thought to take place in the human visual cortex, giving good evidence that their worst-case intractability is not a barrier to their effective use. For the case of deep learning, Sanjeev pointed out that learning the network is effectively trying to learn a TC^0 (ouch!) and yet they have been used successfully by machine learning people to achieve high accuracy rates on large corpora.

Amir Abboud, Virginia Vassilevska Williams and Oren Weimann. Consequences of Faster Alignment of Sequences.

Amir gave this talk on conditional lower bounds on local matching a pair of strings: given a function defining a score for every pair of symbols, find a pair substrings from the input string that maximizes the score. Like other string problems, including longest common subsequence, the best known algorithm for local alignment takes quadratic time and has resisted improvements in the exponent to, say, O(n^1.99).
Amir was quick to point out local alignment is hugely important in genomic sequence analysis (heuristics like BLAST are hugely popular) and that quadratic is too slow in these cases, since n can be several billion. Abboud, Williams, and Weimann prove that a O(n^{2-ε})-time algorithm for local alignment implies the following results:

3SUM solved in O(n^{2-δ}) time for some δ, refuting the weak form of the 3SUM conjecture.
CNF-SAT solvable in O((2-δ)^n) time, refuting the strong exponential time hypothesis.
Max-4-Clique solvable in O(n^{4-δ}) time for some δ, resolving a long-standing opening problem of improving the best-known algorithm for Max-4-Clique.

They also prove similar results for other problems, including edit distance with gap penalties, and longest commen subsequence with wildcards (input strings are allowed to have wildcard characters that match any character). The proofs follow the usual approach of using a O(n^{2-ε})-time local alignment algorithm to produce faster algorithms for the other problems and those that Amir showed were very clean.

Amihood Amir, Timothy M. Chan, Moshe Lewenstein and Noa Lewenstein. On Hardness of Jumbled Indexing.

This work has a similar flavor to the Abboud, Williams, Weimann paper, using the 3SUM conjecture to develop lower bounds for string problems. Unlike that paper, this work gives lower bounds for data structures answering queries. The main problem considered is of jumbled, a.k.a. histogram indexing: given a string T with alphabet of size k, preprocess T to answer queries of the form "what substrings of T have character occurrence counts ", i.e. what substrings have the given histogram. For many pattern matching problems on strings, suffix trees/arrays are used to achieve linear-time queries.
For histogram indexing, naive approaches yield either O(1) preprocessing and O(n) query time (scanning the string using a window whose size is the sum of the entries in the histogram) or O(kn^2) preprocessing time and O(1) query time (precompute the histograms of all substrings and look up the answer). The contribution of Amir, Chan, Lewenstein, and Lewenstein is to prove that, assuming the 3SUM conjecture, these results are the best possible.
For a fixed alphabet of size k at least 3, they prove that achieving O(n^{2-2/k}) preprocessing and O(n^{1-1/k}) query time simultaneously is 3SUM-hard. For alphabets of non-constant size, they prove that achieving O(n^{2-ε}) preprocessing and O(n^{1-δ}) query time is 3SUM-hard. He also mentioned that unpublished work with Timothy Chan achieves a positive result in the form of a data structure that achieves O(n^{2 - 2/k + O(1)}) preprocessing and O(n^{1 - 1/k + O(1)}) query time simultaneously for fixed alphabets.
Moshe did a good job of mixing the problems, motivation, and technical details in a way that kept the talk interesting. Related to the reductions in this work, which use a reduction from convolution 3SUM (testing whether there exist two indices i, j, of the input array such that A[i] + A[j] = A[i+j]), he also recommended that people read Patrascu's paper that proves this special form of 3SUM is 3SUM-hard.

John Iacono and Özgür Özkan. Why Some Heaps Support Constant-Amortized-Time Decrease-Key Operations, and Others Do Not

John's hand-drawn slides caught the attention of some people from the very beginning, and combined with his deadpan delivery and promise to "keep it light" left the audience quiet chuckling through the talk. The talk started with a chat about heaps. All good heaps have O(log(n))-time insert and extract-min operations. Decrease-key is also an operation of heaps; some heaps are better at it. There are three families of heaps with three different running times for decrease-key:

Fibonacci and "improvements" that achieve O(1) time.
Pairing heaps that achieve O(2^{2^{sqrt{loglog(n)}}}) time.
Elmasry/sort heaps that achieve O(loglog(n)) time.

All decrease-key operations of these trees pair small trees into bigger trees. What distingushes them is how they do this:

Pairing heaps: repeatedly pair trees recursively. Repeat until one tree remains.
Elmasry/sort heaps: break roots into blocks of size log(n), chain all roots of a block together.
For Fibonacci heaps: pair heaps whose roots have the same number of children.

Note that the Fibonacci pair based on tree size, not key value, and use an augmentation of loglog(n) bits in each root. Seems like you augmented bits = fast? Fredman considered this in 1999, proving O(1) decrease-key implies Omega(loglog(n)) bits for some classes of heaps, namely pairing heaps. But lower bound requires assumption that every time a pair of roots have their values compared, their heaps are combined.
This contribution of Iacono and Özkan is a new lower bound without such the requirement that trees are combined when compared, but with a new restriction: must pay pointer model costs. This result yields lower bound of Omega(loglog(n)) for pairing and sort heaps, but says nothing about Fibonacci heaps because they cheat: they do not pay pointer model costs. A pointer model version would build a tree above the array, which would have height loglog(n) and give decrease-key a similar running time. The actual proof is adversarial, and considers a whole family of distinct Fibonacci heaps that must grow larger than the total number of possible heaps, a contradiction.

ICALP Day 2 (guest post by Andrew Winslow)

Andrew Winslow was one of the colleagues who kindly answered my call for guest bloggers from ICALP 2014. Here is guest post on the second conference day. Enjoy it!

Invited Talk: Victor Kuncak

Victor gave a talk entitled "Verifying and Synthesizing Software with Recursive Functions" about work with his students, including Ruzica Piskac, Phillipe Suter, Eva Darulova, Ettienne Kneuss, and Andrej Spielmann. The work focused on a number of problems in the area of turning a specification into a program that meets it. He did a great job of breaking out four types of problems they're working on:

	Run-time	Compile-time
Have specification C and program P	Assertion checking (Check P meets C)	Prove correctness (Prove P always meets C)
Have specification C	Constraint programming (Carry out execution according to C)	Program synthesis (Create P meeting C)

Their work is done in Scala and implemented in the Leon system. Out of the four types of problems, Victor indicated that the first two (first row of the table) are easier. Of course, easier doesn't mean easy, and he briefly covered a few of the ideas (induction, arithmetic/algebraic reasoning, SAT solvers, integer linear programs) used to achieve good empirical performance -- a few seconds to prove correctness programs of 100 lines of code or more, and counterexamples in the case of incorrect code.
Constraint programming was next, and he gave the example of programming red-black tress via constraints: using as constraints the five properties a red-black tree must have. Insertion, for instance, would be encoded as a combination of the five properties, plus the fact that the new tree must have the elements of the old tree plus the inserted item. Although the performance was not great, he stressed that the applications of constraint programming are more along the lines of prototyping new data structures, test case generation, etc. One cool trick was using the fact that a solution is a counterexample of the negation of the constraints...reducing the problem to one in program verification.
Finally, he covered most extreme problem: writing code based on a specification. Here he had some really nice examples, generating code for converting dates and times, simple data structures like sorted lists, etc. in just a few seconds. Many ideas were used, but I was especially impressed by his mention of their work on floating point representations, synthesizing programs that avoided rounding errors.

Andreas Björklund and Thore Husfeldt, Shortest Two Disjoint Paths in Polynomial Time

Andreas gave the talk for this Best Paper award winner in Track A. The problem considered is a very simple one: given an undirected graph and two pairs of vertices (s1, t1) and (s2, t2), find a pair of disjoint paths whose total length is minimized. Several problems near to this one are NP-hard (e.g. minimizing the maximum of the two path lengths), so placing the problem into P is the major thrust of the paper -- the algorithms given run in O(n^11) and O(n^22) time for vertex- and edge-disjoint paths, respectively.
The approach of the algorithm is almost entirely matrix-based, focusing on solving a cycle cover problem on the input graph with self-loops added to each vertex...except that doesn't quite work. Andreas did a great job of leading us on a journey where obstacle after obstacle to the approach was eliminated. Many of the issues relate to the complexity of computing permanent...indeed one of the major technical contributions is proving that computing the permanent of a matrix over a certain type of ring is in P. In the end, the necessary permanent computation is replaced with a polynomial number of determinant computations, giving the large (but polynomial) running time.
In perhaps the most natural course of events, someone asked after the talk whether the technique might work for the shortest three disjoint paths. So natural was the question that a followup slide for the question was already prepared. The conclusion was "probably not", but Andreas posed a first problem to try in this direction: given an undireced graph, compute the parity of the number of cycle covers having c cycles, with c divisible by 4.

Mitsuru Kusumoto and Yuichi Yoshida, Testing Forest-Isomorphism in the Adjacency List Model

Mitsuru started by introducing the notion of property testing on graphs: computing whether the graph has a given property, answering correctly at least 2/3rds of the time while using only a small number of queries on the graph. In this setting, even the representation of a graph as an adjacency matrix or list matters, as checking whether a given edge is present takes too long with an adjacent list repesentation.
Kusumoto and Yoshida considered the problem of testing whether two forests are isomorphic. It was known that in general graphs this problem requires Omega(sqrt(n)) queries to the graphs, and only O(1) if the graphs have bounded degree. For forests, they achieve an algorithm that uses only O(polylog(n)) queries and prove Omega(sqrt(log(n)) are needed. The algorithm works by first performing a special type of partitioning of the input graph, then testing whether each subgraph is isomorphic. The partitioning is done in a way that ensures that if the two input graphs are significantly far away from being isomorphic, then some pair of their subgraphs in their partitions are too.

Rom Aschner, Matthew J. Katz, Bounded-Angle Spanning Tree: Modeling Networks with Angular Constraints

Rom gave a nicely visual talk about a problem in wireless network design in the plane. The input is a set of points, and the output is a (Euclidean) minimum spanning tree (MST) with an additional constraint: the smallest angle spanning all of the edges is at most some fixed constant b (a b-MST). Another way to think about this constraint is that the largest angle between any pair of adjacent incoming edges must be at least 2*π - b.
For b < π / 3, a b-MST may not even exist: three points at the vertices of an equilateral triangle require at least one of the points to have two incoming edges with interior angle π / 3. For b >= 8 π / 5, the b-MST is simply the minimum spanning tree, as the upper degree bound of 5 on every node in an MST ensures the smallest angle spanning all of the edges is at least 2 * π - 2 π / 5 = 8 π / 5. Finally, for b < 1/2, there exists a simple family of examples (n-1 points placed collinearly ε distance apart and a final point distance 1 from the shared line) that causes the b-MST to have total weight ε * (n - 1) + 1 and b-MST to have total weight 1 * (n - 1) = n - 1. So the non-trivial b are those in the interval [1/2, 8/5*pi).
For these, Aschner and Katz prove both NP-hardness and approximation results. They prove that finding the b-MST for b = π and b = 2 π / 3 are both NP-hard, using a reduction from Hamiltonian path in square and hexagonal grid graphs, respectively. They also achieve constant-factor approximation algorithms for b = π, 2 π / 3, π / 2. The approximation algorithms use a common sequence of steps:

Compute the MST.
Turn the MST into a Hamiltonian cycle via a tour around the boundary of the MST.
Decompose the points into consecutive groups along the cycle.
For each group, compute a set of edges connecting the points and minimizing total edge length.
Connect adjacent groups using the shortest edge possible.

Step 4 requires a careful selection of group sizes, as larger groups give greater potential for shorter collections of edges, but for large enough b, actually connecting the points may not be possible. The resulting approximation ratios are 16 and lower, and can further be reduced by a neat pigeonhole argument regarding shofting the groupings along the tour.

Canal Tour and Conference Dinner

The day concluded with a tour through the canals of Copenhagen and dinner at the Odd Fellow Palace. The 2014 Gödel Prize was awarded to Ronald Fagin, Amnon Lotem, and Moni Naor. Moni gave a great historical and technical talk on the work for which they received the award, including the threshold algorithm. Ronald Fagin also spoke briefly about Michael Franklin, the "missing fourth author" of the work, and his role in facilitating the successful collaboration.

Wednesday, July 09, 2014

July 8, 2014 — ICALP Review (guest post by Clément Canonne)

Here is a first guest post from ICALP 2014 kindly written by Clément Canonne. Enjoy it! (This post is also available in PDF. Any typo or typesetting issue with the HTML version is my responsibility.)

On DNF approximators for monotone Boolean functions (Eric Blais)

In this talk, Eric Blais presented joint work with Johan Håstad, Rocco Servedio and Li-Yang Tan, concerned with Boolean functions. More precisely, "the simplest representations of functions": DNF (Disjunctive Normal Form) formulas.
For a little bit of background, recall that a Boolean function f : {0,1}ⁿ →{0,1} defined on the hypercube [2] is a DNF if it can be written as

that is as an OR of AND’s. One can also see such functions as being eactly those taking value 1 on an "union of subcubes" (if One prefers. I will not argue with One).

A nice property of DNF formulas is that they are arguably amongst the simplest of all representations of Boolean functions; while formulas of depth ≥ 3 are a nightmare, DNFs have been extensively studied, and by now "Everything is known about them". Well, almost everything.

Indeed, amidst other facts, we have that

Theorem 1 (Folklore). Every Boolean function can be computed by a DNF of size 2^n-1.

Theorem 2 (Lupanov ’61). This is tight (PARITY_n needs that much).

Theorem 3 (Korshunov '81, Kuznetsov '83). A random Boolean function can be computed by DNFs of size Θ(2ⁿ∕log n) (and requires that much).

So... are we done yet? The mere presence of Eric in the auditorium was a clear hint that all was not settled. And as it turns out, if the picture is well understood for exact computation of Boolean functions by DNFs, what about approximate representation of a function? That is, what about the size required to approximate a Boolean function by a DNF, if one allows error ε (as a fraction of the inputs).

This leads to the notion of DNF approximator complexity; and here again some results – much more recent results:

Theorem 4 (Blais–Tan ’13). Every Boolean function can be approximated by a DNF of size O(2ⁿ∕log n). Furthermore, our all friend PARITY_n only needs DNF size O(2^(1-2ε)n).

That’s way better than 2^n-1. So, again – are we done here? And, again... not quite. This brings us to the main point of the paper, namely: what about monotone functions? Can they be computed more efficiently? Approximated more efficiently? (Recall that a Boolean function f is monotone if x ≼ y implies f(x) ≤ f(y), where ≼ is the coordinate-wise partial order on bit-strings.)
As a starter: no.

Theorem 5 (Folklore) Every monotone Boolean function can be computed by a DNF of size O(2ⁿ∕n^1/2) (using subcubes rooted on each min-term); and again, this is tight for PARITY.

Furthermore, and quite intuitively, using negations does not buy you anything to compute a monotone function (and why should it, indeed?):

Theorem 6 (Quine ’54). To compute monotone Boolean functions, monotone DNFs are the best amongst DNFs.

Not surprising, I suppose? Well... it’s a whole new game when one (one, again!) asks only for approximations; and that’s the gist of the paper presented here. First of all, drastic savings in the size of the formulas!

Theorem 7 (Blais–Hastad–Servedio–Tan ’14). Every monotone Boolean function can be approximated by a DNF of size O(2ⁿ∕2^{Ω(n^1/2)} ).

Eric gave a high-level view of the proof: again, it works by considering the subcubes rooted on each min-term, but in two steps:

Regularity lemma: the world would be much simpler if all subcubes were rooted on the same level of the hypercube; so first, reduce it to this case (writing f = f₁ + .....+ f_k, each f_i has this property)
then, approximate independently each f_i, using a probabilistic argument (via random sampling), to prove there exists a good approximator for all f_i’s, and then stitching them together.

And they also show it is tight: this time with the majority function MAJ_n. The proof goes by a counting argument and concentration of measure on the hypercube (every or almost every input is on the middle "belt" of the hypercube; but each subcube thus has to be rooted there, and each cannot cover too much... so many are needed)

So, approximation does buy us a lot. But clearly, using negations shouldn’t, should it? Why would allowing non-monotone DNF’s to approximate monotone functions ever help? (Hint: it does.) (Yep.)

Theorem 8 (Blais–Hastad–Servedio–Tan ’14). For every n, there exists ε_n and f : {0,1}⁶ⁿ →{0,1} such that

f can be ε_n-approximated by DNFs of size O(n);
any monotone DNF ε_n-approximating f must have size Ω(n²).

(Take that, intuition!)

The upshot: exact computation and approximate computation have intrinsically very different properties!

Eric then concluded with an open question: namely, how to improve/better understand the gap between approximating functions with monotone DNF vs. approximating them with general DNF’s (the currently known gap in the size being quite huge – almost exponential). Additionally, how to get a separation as in the mind-boggling theorem above, but changing the quantifiers – that is, for a constant ε independent of n?

Also, can someone fix my intuition? I think it’s broken.

[1] http://www.cs.columbia.edu/~rocco/papers/icalp14.html
[2] Not this one.

Day 1 of ICALP 2014 (guest post by Andrew Winslow)

Andrew Winslow was one of the colleagues who kindly answered my call for guest bloggers from ICALP 2014. Here is guest post on the first conference day. Enjoy it!

Conference Welcome

Local organizing committee chair Thore Husfeldt kicked off ICALP by mentioning a few stats about IT University of Copenhagen (ITU), where the conference is located:

Started in 1999(!)
2000 students.
10 theory CS faculty (mostly working in "track B" areas in ICALP parlance)
1 building.

He also noted that ITU is separate from two other universities in Copenhagen hosting TCS conferences this month: Copenhagen University (hosting SEA last week) and Technical University of Denmark (hosting SWAT last week).

Invited Talk: Claire Matthieu

Claire presented work entitled "Homophily and the Emergence of a Glass Ceiling Effect in Social Networks", joint work with Chen Avin, Zvi Lotker, Barbara Keller, David Peleg, and Yvonne-Ann Pignolet. She started the talk by asking why there's only one female Danish scientist (Inge Lehmann) listed on the Wikipedia page for famous Danish scientists out of about 60 entries -- why so rare? Looking at the DBLP database data as a social network with co-authorship edges gives similarly bleak results: the induced subgraphs of men and women in the top 1000 nodes shows a large, well-connected graph and a disconnected, sparse graph, respectively.
Claire and co-authors developed a formal model of growing co-authorship social networks, including assumptions about what a "glass ceiling effect" means for these networks, and proved that three properties are minimal sufficient set of properties to imply the effect. These properties are:

Preferential attachment: a "rich get richer" effect where incoming students choose to work an advisor with probability proportional to the number of co-authors the advisor has.
Minority: an incoming student is female with probability r < 0.5.
Homophily: students stick with an advisor of the same gender with probability 1, otherwise find a new advisor with probability p < 1.

Incoming students are iteratively added to the graph to form growing random graphs, repeatedly chosing new advisors according to homophily and preferential attachment. The glass ceiling effect is defined by looking at all nodes with degree k, where k = k(n) chosen so that the number of such nodes goes to infinity with n, and examining the ratio of women to men in this set. If the expected ratio of female over male nodes goes to 0 as the number of nodes goes to infinity, then the system is said to have a glass ceiling effect for the parameters r, p chosen. They are able to prove that the above three properties imply the class ceiling effect, and removing any one of them does not. The model was also verified experimentally using the DBLP database.
As a final twist, Claire considered modifying the model to account to capture the "men are intrinsically better" hypothesis, giving new male students two advisors rather than one. This was proven not to cause the glass ceiling effect. I was really struck by the meta-ness of the talk and how natural the problem seems. In the formal study of (social) networks, isn't the most natural network of study the one that directly impacts our ability to formally study networks?

György Dósa and Jiří Sgall. Optimal Analysis of Best Fit Bin Packing.

Dösa and Sgall complete the analysis of two classic (online) algorithms for the standard bin packing problem: given a collection of items specified by numbers in [0, 1], assign these items to the fewest number of bins where each bin has an item total of at most 1. The two algorithms, first-fit and best-fit, process the items one-by-one, placing each item permanently into an existing bin if possible, otherwise placing the item into a newly created bin. First-fit and best-fit differ only in the criteria by which they place item into existing bins, using the oldest bin (first-fit) or bin that would have the least remaining room after the item is placed into it (best-fit). It was previously shown that for every instance with optimal solution using OPT bins, first-fit and best-fit use ceil(1.7*OPT) bins and there exist instances where both use 1.7*OPT bins. Dösa and Sgall improved the upper bound to 1.7*OPT bins, dropping the ceiling and completing the analysis of the competitive ratio of these algorithms. Jiří commented that he felt that such classic algorithms deserved a complete analysis and I agree; the talk had an air of finality, having put the 40-year-old gap to rest.

Hossein Esfandiari, Mohammadtaghi Hajiaghayi, Reza Khani, Vahid Liaghat, Hamid Mahini and Harald Räcke. Online Stochastic Reordering Buffer Scheduling.

Vahid Liaghat gave this talk on a scheduling problem where items have types from a fixed universe and enter into a buffer of sized size k. One can then process all items of a type, flushing them from a buffer and causing new items to take their place. The cost of processing a sequence of items is either the number of processing steps used (block operation model) or number of times the type processed changes (standard model). It was known that a O(loglog(k))-competitive algorithm is possible for the adversarial/worst-case. This work studies the sitation where the input is not adversarial, but is randomly sampled from some set. Esfandiari et al. achieve an average-case O(1)-competitive algorithm for both the block operation and standard models for the case where an adversary designs an input that is then randomly permuted. The analysis uses several lemmas regarding the optimal solutions of various buffer sizes as well as threshold algorithms: algorithms that process each type as soon as a specific number of occurrences of a type is exceeded.

Erik D. Demaine, Yamming Huang, Chung-Shou Liao and Kunihiko Sadakane. Canadians Should Travel Randomly.

This paper attracted a few comments related to the curious title. The title comes from the "canandian traveler problem" of study, introduced by Papadimitriou and Yannakakis in 1991. The problem is a shortest path problem with partial information : some of the edges of the path are blocked (by intense canadian snowfall) and cannot be traversed, but this is only known after a traveler has visited an endpoint of the blocked edge. The parameterized version of the problem has k blocked edges, and the goal is find an algorithm with good competitive ratio compared to the shortest path that avoids all blocked edges. For deterministic algorithms, tight 2k+1 upper and lower bounds on the ratio are known, while for randomized algorithms, only a lower bound of k+1 (and implied upper bound of 2k+1) were known.
Demaine, Huang, Liao, and Sadakane improve the best-known randomized algorithm in two ways, giving a polynomial-time algorithm that achieves a ratio of (2 - o(1))k + 1, and a pseudo-polynomial-time algorithm (parameterized by k, the number of blocked edges) that achieves a (1 + sqrt(2)/2)k + 1 ratio. The ideas for both algorithms are to repeatedly compute a collection of nearly shortest paths, traverse them each until blocked, and repeating the process with the new information if none of the paths lead to reaching the destination. A careful selection of the number of paths used then yields the bound. The main difference between the two algorithms is exactly what short paths are used; the pseudo-polynomial-time algorithm uses a list of shortest paths, whereas the polynomial-time algorithm only finds paths within a constant factor of the shortest.

Dmitry Gavinsky and Shachar Lovett. En Route to the log-rank Conjecture: New Reductions and Equivalent Formulations.

Dmitry started by reviewing two-party communication problems, where the goal is to compute a combined Boolean function f of inputs x and y held by the two parties while passing the fewest number of bits between the parties. One can consider the matrix defined by the two inputs, with an entry for every (x, y) pair, called the Boolean matrix. The rank of this matrix (and Boolean function f) is the minimum number of rectangles found in some partition of the matrix into rectangular blocks of entries with the same value. It is known that if a c-bit communication protocol is possible, then rank of the matrix is at most 2^c, i.e. the communication complexity (in bits) is bounded from below by log(rank(f)). The log-rank conjecture is that polylog(rank(f)) is an upper bound on the communication complexity of computing f.
Part of this work by Gavinsky and Lovett is to provide a number of equivalent versions of the problem that may provide a way to make progress on the log-rank conjecture. For instance, they show that proving the log-rank conjecture for the randomized communication complexity on low-rank functions implies the log-rank conjecture for the deterministic version, as does proving the conjecture for the information cost of f, the minimum amount of information given by each party the other party in the worst-case.

EATCS General Assembly

The day ended with the EATCS general assembly led by Luca Aceto and appearances by the many people involved in ICALP and EATCS. ICALP 2015 will be in Kyoto, Japan, and some information about travel, accomodations, and Kyoto were presented by Kazuo Iwama, and ICALP 2016 will be in Rome, Italy. Some statistics about ICALP 2014 were given by Elias Koutsoupias -- the number of submissions is up from previous years, acceptance rate is about steady at 30%, and the vast bulk of the submissions were to to Track A.
As one final thought from the day, the organization of the conference so far has been top-notch. Many of the talks are online (including Claire Matthieu's), the sessions start and end on time, sound and projection issues are non-existent, and the food/drink is tasty (and rotates every break...).

Monday, July 07, 2014

In Copenhagen for ICALP 2014

It is good to be back in lovely Copenhagen and breathe the atmosphere of the Danish capital city.

One of the first things I saw when arriving in Kongens Nytorv was a smiling guy showing a sign to cyclists and car drivers that read "Husk; du er smuk" (Remember; you are beautiful).

The Carsten Dahl Trio opened the Copenhagen Jazz Festival events at the legendary Jazzhus Montmartre (which can host 150 people) with a lunch concert to launch their new recording.

The ICALP welcome reception was an opportunity to meet colleagues, thank the organizing committee for all the work they have done in preparing the conference and to enjoy the band De Græd, which created a jazzy and relaxed ambience.

The conference kicks off tomorrow at 6:30am GMT with the invited talk Homophily and the Glass Ceiling Effect in Social Networks by Claire Mathieu (ENS Paris). The talk will be streamed live as will all the other invited talks and most of the other plenary events.

Watch this space for some reports on ICALP 2014.

Friday, July 04, 2014

Music composed for the ICALP 2014 Gödel Prize ceremony gets coverage in Nature!

The piece of music composed for the ICALP 2014 Gödel Prize ceremony gets coverage in Nature! See also the video presenting the composition (in Danish, but with English sub-titles).

Thanks to Thore Husfeldt for adding this further twist to the award ceremony for the Gödel Prize! I am looking forward to the event very much.