Geometry of Big Data – Tuesday session

All talks are summarised in my words which may not accurately represent the authors’ opinion. The focus is on aspects I found interesting. Please refer to the authors’ work for more details.

Session 1 – Graph-based persistence

The talk On the density of expected persistence diagrams and its kernel based estimation is given by Frederic Chazal. A draft is available on arxiv.

Grow balls around the data points and connect points with an edge whenever their balls meet. As the radius grows, this produces a filtered simplicial complex, and persistent homology records how the homology changes along the way (e.g. adding an edge may change the homology). Persistence barcodes and persistence diagrams encode the same information produced by this process.
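To make the growing-scale idea concrete, here is a minimal sketch that only tracks connected components (0-dimensional homology) with a union-find structure. The function name h0_barcode and the sample points are my own illustration, not taken from the talk, and the scale is measured as the pairwise distance at which an edge appears (twice the ball radius).

```python
from itertools import combinations
from math import dist, inf


def h0_barcode(points):
    """Return (birth, death) bars of connected components as the scale grows."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the component representative, halving paths.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Edges enter the filtration in order of their length.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj              # two components merge...
            bars.append((0.0, length))   # ...so one of them dies at this scale
    bars.append((0.0, inf))              # the last component never dies
    return bars


# Hypothetical data: two tight pairs of points that are far apart.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
bars = h0_barcode(points)
```

The two short bars come from the points inside each pair merging early, while the long finite bar reflects the gap between the pairs. Higher-dimensional features (loops, voids) need a full simplicial complex and are not covered by this sketch.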

Measures are nicer to work with than sets of points for statistical purposes. If the persistence diagram D is a random variable, then E[D] is a deterministic measure on R². Persistence images reveal E[D] and are more interpretable than persistence diagrams, which may be too crowded for visual inspection with a large sample.

Persistence can be used as an additional feature on a dataset. For example, we can draw random samples from the dataset, compute the persistence diagram/image of each, and compare them across samples to get an idea of the stability of the homology.

Session 2 – Log-concave density estimation

The talk Log-concave density estimation: adaptation and high dimensions is given by Richard Samworth. The paper is available at Project Euclid.

To estimate a density f_0 from a random sample there are generally two approaches: parametric and non-parametric methods. A density f is log-concave if log f is concave, which implies that its super-level sets are convex. Univariate examples include the normal and logistic densities, among others. The class is closed under marginalisation, conditioning, convolution and linear transformations.
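The definition can be checked numerically on a grid. The sketch below is a sanity check rather than a proof: it tests that the second differences of log f are non-positive. The helper names, the grid and the bimodal counterexample are my own assumptions for illustration.

```python
from math import exp, log, pi, sqrt


def normal_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def logistic_pdf(x):
    return exp(-x) / (1.0 + exp(-x)) ** 2


def mixture_pdf(x):
    # Equal mixture of N(-3, 1) and N(3, 1): bimodal, so not log-concave.
    return 0.5 * normal_pdf(x + 3.0) + 0.5 * normal_pdf(x - 3.0)


def looks_log_concave(pdf, lo=-6.0, hi=6.0, steps=241, tol=1e-9):
    """Check that second differences of log pdf are non-positive on a grid."""
    h = (hi - lo) / (steps - 1)
    logs = [log(pdf(lo + k * h)) for k in range(steps)]
    return all(
        logs[k - 1] - 2.0 * logs[k] + logs[k + 1] <= tol
        for k in range(1, steps - 1)
    )


normal_ok = looks_log_concave(normal_pdf)
logistic_ok = looks_log_concave(logistic_pdf)
mixture_ok = looks_log_concave(mixture_pdf)
```

The normal and logistic densities pass the check, while the two-bump mixture fails around the valley between its modes, where log f curves upwards.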

Without a shape constraint, the likelihood is unbounded and the maximising density surface is spiky. Restricting to log-concave densities addresses this.

Session 3 – Infinite Width Neural Nets

The talk Infinite-Width Bounded-Norm Networks: A View from Function Space is given by Nathan Srebro and has two parts: Infinite Width ReLU Nets and Geometry of Optimization Regularization and Inductive Bias.

Part 1: When we are learning, we find a good fit (of weights) for the data. What kind of functions can be approximated by a neural net? Essentially all, but the question is how large the network has to be to approximate f to within error e. The question should rather be: what class of functions can be approximated by low-norm neural nets? Another question should be: given an unbounded number of units, what norm is required to approximate f to within any error e? The cost of the weights is taken as the parameter. This results in linear splines. A neural net with infinite width and one hidden layer solves the Green’s function.
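A small sketch of why linear splines appear: a one-hidden-layer ReLU net on a scalar input is a continuous piecewise-linear function with a kink wherever a unit switches on. The weights below are hypothetical, and taking the cost as the sum of squared weights with the biases left out is my assumption about the norm meant in the talk.

```python
def relu(z):
    return max(0.0, z)


def shallow_net(x, hidden, bias=0.0):
    """One-hidden-layer ReLU net on a scalar input.

    hidden is a list of (w, b, a) triples: input weight, unit bias, output weight.
    """
    return bias + sum(a * relu(w * x + b) for w, b, a in hidden)


def weight_cost(hidden):
    # Sum of squared weights; the unit biases are left out of the penalty.
    return sum(w * w + a * a for w, b, a in hidden)


# Hypothetical weights: kinks sit at x = -b / w, i.e. at 0, 1 and 2.
hidden = [(1.0, 0.0, 1.0), (1.0, -1.0, -2.0), (1.0, -2.0, 1.0)]

values = [shallow_net(x, hidden) for x in (-1.0, 0.5, 1.0, 1.5, 3.0)]
cost = weight_cost(hidden)
```

These three units build a "hat" that rises from 0 to 1 on [0, 1], falls back to 0 on [1, 2] and is flat elsewhere. Between two neighbouring kinks the function is exactly linear.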

Part 2: How does depth influence this? Deep learning should be considered with infinite width and implemented with a finite approximation. Deep learning focuses on searching a parameter space that maps into a richer function space.

Session 4

The talk Some geometric surprises in modern machine learning is given by Andrea Montanari.

Session 5

The talk Multi-target detection and cryo-EM imaging by autocorrelation analysis is given by Amit Singer.

Session 6

The talk Learning to Solve Inverse Problems in Imaging is given by Rebecca Willett.

Geometry of Big Data – Monday session


Session 1 – Learning DAGs

The talk DAGs with NO TEARS: Continuous Optimization for Structure Learning is given by Pradeep Ravikumar. A draft is available on arxiv.

Learning directed acyclic graphs (DAGs) is traditionally done in two ways: conditional-independence tests and score-based search. The latter poses a local search problem without a clear answer. More recently the problem has been posed as a continuous (global) optimisation for undirected graphs.

The loss function is the log-likelihood of the data, and we need to find the most appropriate weight matrix W such that X = XW + E, where E is the noise. They provide a new M-estimator.
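As a rough illustration of the model X = XW + E, the sketch below simulates data from a small hypothetical DAG and evaluates a least-squares loss. It also includes the smooth acyclicity measure h(W) = tr(exp(W∘W)) − d, which I take from the NO TEARS paper rather than from my notes; the truncated series, the function names and the example matrices are my own assumptions, and no optimisation is performed here.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def acyclicity(w, terms=20):
    """h(W) = tr(exp(W * W)) - d with an elementwise square, via a truncated series."""
    d = len(w)
    sq = [[v * v for v in row] for row in w]
    power = [[float(i == j) for j in range(d)] for i in range(d)]  # (W*W)^0 / 0!
    trace = float(d)
    for k in range(1, terms):
        power = [[v / k for v in row] for row in matmul(power, sq)]  # (W*W)^k / k!
        trace += sum(power[i][i] for i in range(d))
    return trace - d


def least_squares_loss(x, w):
    """(1 / 2n) * ||X - XW||_F^2, the Gaussian log-likelihood up to constants."""
    xw = matmul(x, w)
    n = len(x)
    return sum((a - b) ** 2 for ra, rb in zip(x, xw) for a, b in zip(ra, rb)) / (2 * n)


# Hypothetical DAG 0 -> 1 -> 2 with edge weights 2 and -1.
w_true = [[0.0, 2.0, 0.0], [0.0, 0.0, -1.0], [0.0, 0.0, 0.0]]
rng = random.Random(0)
rows = []
for _ in range(500):
    e = [rng.gauss(0.0, 1.0) for _ in range(3)]
    x0 = e[0]                  # column j of X = XW + E, in topological order
    x1 = 2.0 * x0 + e[1]
    x2 = -1.0 * x1 + e[2]
    rows.append([x0, x1, x2])

w_zero = [[0.0] * 3 for _ in range(3)]
w_cycle = [[0.0, 1.0], [1.0, 0.0]]     # 0 -> 1 -> 0 is a cycle

h_dag = acyclicity(w_true)
h_cycle = acyclicity(w_cycle)
loss_true = least_squares_loss(rows, w_true)
loss_zero = least_squares_loss(rows, w_zero)
```

The measure is zero for the DAG and strictly positive for the two-node cycle, which is what allows the acyclicity requirement to be written as a smooth equality constraint. The true W also gives a clearly lower loss than the empty graph.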

Session 2 – Parallel transport for data alignment

The talk Data Analysis with the Riemannian Geometry of Symmetric Positive-Definite Matrices is given by Ronan Talmon. A draft is available on arxiv.

The talk focuses on how to align data when the intersubject variation is large but consistent and the intrasubject variation can be mapped. Parallel transport aims to align the intersubject values on a symmetric positive-definite (SPD) embedding in n-dimensional space. SPD matrices are embedded on a hyperboloid and all computations can be performed in closed form.

For data from multiple subjects and multiple sessions, it does not matter whether the sessions or the subjects are adapted first. This only holds for parallel transport and not for identity transformations.

Session 3 – Persistence framework for data analysis

The talk Metric learning for persistence-based summaries and application to graph classification is given by Yusu Wang. An underlying paper is available on PlosOne.

Persistence diagrams can be used to describe complexity. The features are simpler than the underlying object but persistent. Viewing a geometric object through a filtration produces a summary. A filtration is a growing sequence of spaces. The times at which features are created and destroyed can be mapped onto a persistence diagram with death time on the y-axis and birth time on the x-axis.

The bottleneck distance is based on a matching between two persistence diagrams: among all matchings, it takes the one whose largest distance between matched features is smallest. Features may be matched to the diagonal (a zero-persistence feature, capturing noise) if they are too close to it. More complex approaches include persistence images, which turn the diagram (after transforming it) into a kernel density.
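For very small diagrams the bottleneck distance can be computed by brute force over all matchings. The sketch below assumes diagrams are lists of (birth, death) pairs and uses the L-infinity distance between points; the function name and the example diagrams are hypothetical, and real libraries use much faster matching algorithms.

```python
from itertools import permutations


def bottleneck(diagram_a, diagram_b):
    """Brute-force bottleneck distance between two small persistence diagrams."""
    def to_diagonal(p):
        # L-infinity distance from (birth, death) to the diagonal birth == death.
        return (p[1] - p[0]) / 2.0

    def cost(p, q):
        if p is None and q is None:
            return 0.0                      # diagonal matched with diagonal
        if p is None:
            return to_diagonal(q)
        if q is None:
            return to_diagonal(p)
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    # Pad each diagram with one diagonal slot (None) per point of the other.
    left = list(diagram_a) + [None] * len(diagram_b)
    right = list(diagram_b) + [None] * len(diagram_a)
    return min(
        max((cost(p, q) for p, q in zip(left, perm)), default=0.0)
        for perm in permutations(right)
    )


# Hypothetical diagrams: one long-lived feature each, plus a short noisy one.
diagram_a = [(0.0, 4.0), (1.0, 1.5)]
diagram_b = [(0.0, 5.0)]
distance = bottleneck(diagram_a, diagram_b)
```

Here the two long-lived features are matched to each other and the short-lived feature is matched to the diagonal, so the noise does not affect the result.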

The weight function should be application dependent and thus can be learned instead of pre-assigned. We can just take the difference between two persistence images as a weighted kernel for persistence images (WKPI).

For graphs, the following functions can be used for persistence. The discrete Ricci curvature captures the local curvature on the manifold. The Jaccard index function compares how many neighbours two nodes have in common, which is good for noisy networks.
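A minimal sketch of the Jaccard index on a graph, assuming the graph is stored as a dictionary of neighbour sets and that the index is evaluated on each edge. The toy graph and names are hypothetical, and variants that include the node itself in its neighbourhood would give different values.

```python
def jaccard(graph, u, v):
    """Jaccard index of the neighbourhoods of u and v."""
    nu, nv = graph[u], graph[v]
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


# Hypothetical undirected graph: a triangle a-b-c with a pendant node d.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

# One value per undirected edge, usable as a descriptor function on the graph.
edge_values = {
    (u, v): jaccard(graph, u, v)
    for u in graph
    for v in graph[u]
    if u < v
}
```

Edges inside the triangle get a positive value because their endpoints share a neighbour, while the pendant edge a-d gets zero.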

In general, a descriptive function must be found for the domain and may even encode meaningful knowledge on how the object behaves. High weights would describe the more distinct features.

Session 4 – Behold the spikes

The talk Proper regularizers for semi-supervised learning is given by Dejan Slepcev.

A d-dimensional point cloud can be converted to a graph representation using a kernel that connects nearby points with edges (with a fall-off or discontinuity). As the number of nodes n goes to infinity, the kernel bandwidth should shrink to 0.
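The construction can be sketched as follows. The kernel here is my own choice for illustration: a Gaussian fall-off inside the bandwidth and a hard cut-off beyond it. The function name and the sample points are hypothetical.

```python
from math import dist, exp


def kernel_graph(points, bandwidth):
    """Weighted edges between points that lie within one bandwidth of each other."""
    weights = {}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            r = dist(points[i], points[j]) / bandwidth
            if r <= 1.0:                      # discontinuity: no edge beyond the bandwidth
                weights[(i, j)] = exp(-r * r)  # smooth fall-off within the bandwidth
    return weights


# Hypothetical point cloud on a line.
points = [(0.0, 0.0), (0.5, 0.0), (2.0, 0.0)]
narrow = kernel_graph(points, 1.0)
wide = kernel_graph(points, 2.5)
```

With the narrow bandwidth only the two nearby points are connected, while the wide bandwidth connects all three. This shows how the bandwidth controls how far label information can travel through the graph.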

The error bandwidth is critical. The take-away is that instead of producing single labeled data points, the label should be extended beyond the kernel bandwidth. A single data label can produce spikes because essentially the minimiser obtains smaller values for a flat surface with a single spike than for an appropriate surface.

Session 5

The talk Solving for committor functions in high dimension is given by Jianfeng Lu.

Session 6 – Finding structure in loss

The talk A consistent framework for structure machine learning is given by Lorenzo Rosasco.

Structured machine learning is not structure learning. It refers to learning functional dependencies between arbitrary output and input data. Classical approaches include likelihood estimation models (struct-svm, conditional random fields, but limited guarantees) and surrogate approaches (strong theoretical guarantees but ad hoc and specific).

Applying empirical risk minimisation (ERM) from statistical learning, we can expect that the mean of the empirical data is close to the mean of the class. However, it is hard to pick a class. The inner risk (decomposing into marginal probability) reduces the class size. Under a strong assumption, the structured encoding loss function (SELF) requires a Hilbert space and two maps such that the loss function can be represented as an inner product. Using a linear loss function helps. For a crazy space Y (which need not be linear) the SELF gives enough structure to proceed. This enlarges the scope of structured learning to inner risk minimisation (IRM).

There is a function psi hidden in the loss function that encodes and decodes from Y to the Hilbert space. The steps are encode Y in H, learn from X to H, and decode H to Y. In linear estimation with least squares, the encoding/decoding disappears and the output space Y is not needed for computation.

About Ujung Kulon: The final Chapter

We arrived in Tamanjaya thinking our car would be waiting for us and we would go straight back home. Far from the truth this thought was. Our driver hadn’t arrived yet, but we didn’t worry yet. We took care of business, like paying for the guide and the boat and getting a free lunch – getting the boat (and the lunch) had been another story. We had met the owner and talked to him when we first arrived in Tamanjaya four days earlier. First we talked about general things, then about personal things, and last about the boat. We drank coffee as we talked and later bargained. It was a hard deal, but in the end the extras made the deal. We would have a dinner at his house once we came back, and some fresh coconuts – so we paid for the boat, went back to the ranger station and waited for our driver to arrive. But as the hours passed by and nobody arrived, we started to worry. As it happened, we had chartered our boat from the Sunda Jaya Homestay, and as nobody arrived the owner invited us to stay with him. He offered to let us stay one night for free since our driver hadn’t come. So we sat down with him and ate dinner. We talked a lot and had a good time. We would have to get up early in the morning to catch the first bus back to Jakarta. So we prepared to go to bed early, and just as we were about to go to sleep, our driver arrived. Happy as could be, we packed our things, thanked the owner of the Sunda Jaya Homestay for the hospitality and left. Our driver had taken one wrong turn on his way to Tamanjaya, which had cost him several hours. At 3 a.m. we were back in Bogor. An adventure had ended more adventurously than expected.

When we arrived in Tamanjaya, we thought our car would be waiting for us and we would drive straight home. We were very wrong. Our driver had not arrived yet, but we were not worried yet. We still had a few things to take care of anyway. We paid the guide and the boat, and then there was that free lunch. Renting the boat (and getting the lunch) was a story in itself: we met the boat owner when we arrived in Tamanjaya four days earlier. We chatted, first about very general things, then about more personal ones, and suddenly we had arrived at the boat. Coffee was served during the negotiations. We haggled hard, and in the end the extras were decisive. We got a lunch on our return and fresh coconuts to drink. So we paid for our boat and went back to the ranger station to wait for the driver. But the hours passed and nobody came, so we began to worry. Luckily the boat owner was also the owner of the Sunda Jaya Homestay, and since our driver did not come, he offered to let us stay with him. We could stay one night for free, as our driver had not turned up. We sat down with him and had dinner. We talked a lot and had an entertaining evening. Since we would have to get up early to catch the bus to Jakarta in time, we wanted to go to bed early. Just as we were about to lie down, our driver arrived. Overjoyed, we packed our things, thanked the owner of the Sunda Jaya Homestay for his hospitality and left. Our driver had taken a wrong turn once, which had cost him several hours. At 3 in the morning we arrived in Bogor. An adventure that had ended more adventurously than expected.

When we arrived in Tamanjaya we thought we would only have to get in the car and could go home. But it did not turn out that way. Our driver had not arrived, but we were not worried yet. First we had a few things to do: pay the guide and the boat, and eat lunch. Renting that boat (and getting the free lunch) is another story. We met the owner of the boat when we arrived in Tamanjaya four days earlier. We talked, first in general, then about private matters and finally about the boat. We drank coffee the whole time. We haggled for a while and finally settled on an acceptable price plus a lunch and fresh coconuts. So we paid for the boat and went back to the ranger station to wait for the driver. The hours passed and nobody came, and in the end we started to worry. Luckily the owner of the boat was also the owner of the Sunda Jaya Homestay, and he invited us to stay at the homestay since our driver had not come. We had dinner with him and talked for a while. It was a good time. The bus to Jakarta leaves early, so we wanted to go to sleep early. Just as we were ready to go to sleep, the driver arrived. Very happy, we grabbed our things, thanked the owner of the Sunda Jaya Homestay for his hospitality and left. The driver had taken a wrong turn once and it had cost him a few hours. At 3 in the morning we arrived in Bogor. An adventure that ended more adventurously than expected.


Ujung Kulon: The Beach and the Moon

At the first shelter we had some hours to spend at the beach, which we gladly did. Silently we sat there for hours, just watching the waves and the clouds passing by. It was silent at that beach; only the surf was to be heard. Inner peace comes upon you when you’re in such a place. Later on I walked the beach for a while and found some crabs, curious plants and an indescribable atmosphere. With the sunset coming, the scenery was perfect. The colours ranged from the brightest yellow to the deepest red. Hypnotized by the colours, it took me a while to notice the moon. What a great first day in the wilderness.

After we had arrived at the first shelter, we had some time to spend at the beach, and we happily did so. So there we sat, hour after hour, without a sound. We watched the waves and the clouds passing by. It was quiet at the beach; only the surf murmured softly. In a place like this you find inner peace. Later I walked along the beach and came across crabs, strange plants and an incredible atmosphere. When the sunset began, the scene was perfect. The colours ranged from the brightest yellow to the deepest red. Enchanted by the colours, I only noticed the moon after a while. What a great first day in the wilderness.

Arriving at the first shelter, we had a few hours to spend at the beach, and we did so gladly. We sat there silently for hours, simply watching the waves and the clouds passing by. Nothing could be heard but the surf. In places like that you find peace within yourself. Later I walked along the beach and found crabs, strange plants and an indescribable atmosphere. As the sun set, the scenery became perfect. The colours ranged from luminous yellow to dark red. Enchanted by the colours, it took me a while to notice the moon. What a good first day in the jungle.


About Temples: Taman Ayun

Since Bali is the Isle of the Gods, it’s kind of part of the deal to visit their temples, at least some of them. This time I take you to Taman Ayun, a temple surrounded by water on three sides. The holiest part is up on a small hill. You have to pass two terraces before you can get there. Behind the sanctum there is a park you may walk through. You might know that only Hindus may enter the holy parts of a temple. But you can walk around it and may even take pictures of it. I must say, I am really touched by these gateways. I like the tower-like columns to the left and right of a passage. But check out the pictures to feel it with me.

Since Bali is the Island of the Gods, it is only proper to visit its temples. You should have seen at least a few of them. This time I am taking you to the temple Taman Ayun, a temple surrounded by water on three sides. The sanctuary lies on top of a small hill. You have to cross two large terraces before you get there. Behind the sanctuary lies a garden you can stroll through. Some of you will know that only Hindus may enter the sanctuary. But you may walk through the other parts of the temple and photograph the sanctuary. I must say that I like the Hindu gates very much. I love these tower-like columns at the edge of a passage. But just look at the pictures to see what I mean.

As Bali is the Island of the Gods, one has to visit its temples, at least some of them. This time I show you Taman Ayun. There is a lake on three sides of the temple. The sacred place is on top of a hill. You have to pass two terraces to get there. Behind the sacred place there is a garden where you can walk as you please. As some of you may know, only Hindus may enter a sacred place. But you can walk through the other parts of the temple and take photos. I have to admit that I like the Hindu gateways very much. I love the tower-like columns beside the entrance. But please, look at the photos to see for yourselves.



Me and the books

For those who don’t know yet. I love to read! The secret is out. The proof? I read everywhere, have a look, also in Bali.

For those who did not know yet: I love to read! The secret is out. The proof? I read everywhere, take a look, even in Bali.

For those who did not know yet: I love to read! The cat is out of the bag. The proof? I read everywhere, take a look, even in Bali.


Art in Singapore

We go back in time, at least a bit, to catch up with Singapore. I know. We are all anxious to see more pictures from Bali. But one should finish a story once started. One thing that did overwhelm me in Singapore was that there was heaps of art: statues, fountains and other curious artistic monuments. I took pictures of some of them, in the business district as well as in the city park or just somewhere in town. Enjoy the great variety.

We go back in time a little to turn to Singapore once more. I know, I know. Actually we are all dying to finally see pictures from Bali. But a story once begun should also be brought to an end. In Singapore I was downright overwhelmed by the omnipresent art. Statues, fountains and other curious works of art. I photographed some of them, in the business district as well as in the park or elsewhere in the city. Enjoy the selection.

We go back in time a little and return to Singapore. I know. We all want to see and hear about Bali. But stories once started have to be finished. The thing that surprised me most about Singapore was that there is so much art there. Statues, fountains and other curiosities. I took photos of some of them, in the business district as well as in the park or anywhere else in the city. Enjoy the photos.
