Florian Philipp Stilz

PhD Student @ CAMP Lab, Technical University of Munich & PhD Student @ CAMMA Lab, University of Strasbourg

I am currently pursuing a PhD in computer science with a focus on multi-modal foundation models for intraoperative surgical procedures at the Technical University of Munich and the University of Strasbourg, co-supervised by Prof. Nassir Navab and Prof. Nicolas Padoy. My main research interests are applying modern Deep Learning methods to 3D Computer Vision, e.g. Neural Rendering and 3D/4D Reconstruction, as well as Multi-modal Deep Learning for both Natural Language Processing and Computer Vision.

florian.stilz@tum.de

More Research

Learning World Models by Self-supervised Exploration
Technical University of Munich
Oct 2022 - Feb 2023 Munich, Germany
Supervisor: Lennart Röstel
We worked on an adapted version of Plan2Explore, in which an agent builds a world model trained in a self-supervised manner. The model was tested on the Stacker task in the DeepMind Control Suite.
Project
Transposition Equivariant Music Key Signature Estimation
Technical University of Munich
May 2022 - Jan 2023 Munich, Germany
Supervisor: Dr. Vladimir Golkov
I developed a global key signature estimation model for audio files. It additionally predicts the tonic and the genre of a piece of music. The final architecture, called PitchClassNet, is equivariant to musical transposition by design. The model achieves competitive results with only a fraction of the parameters of other state-of-the-art methods.
Project
Scientific Paper Classification using Visual and Textual Features
Technical University of Munich
April 2022 - Oct 2022 Munich, Germany
In this project, I analyzed scientific publications through several classification tasks, such as Citation/Year, Year, and Category. The tasks use both visual and textual features: the visual features are the front page and figures of a publication, and the textual features are its title and abstract. In addition, an entirely new dataset of more than 200k publications was generated.
Project
3D Visual Grounding with Transformers
Technical University of Munich
April 2022 - July 2022 Munich, Germany
Supervisor: Dave Zhenyu Chen
This work focuses on developing a transformer architecture that predicts a bounding box around a target object described in natural language. It beats the starting baseline by more than 2% in IoU accuracy.
Project
Segmentation of Medical Records with Natural Language Processing Tools
Technical University of Denmark
Jan 2021 - May 2021 Lyngby, Denmark
Supervisor: Prof. Ole Winther
My B.Sc. thesis addressed segmenting medical abstracts via Named Entity Recognition. The main focus was on using different BERT architectures to classify important medical classes, such as "Diagnosis" and "Symptoms", within medical abstracts for rare diseases.
Project