Project information

  • Source of funding: NCN OPUS
  • Number: 2020/37/B/ST6/03463
  • Amount: 652 800 PLN

Deep generative models for 3D representations

The project concerns deep generative models for 3D representations. The main idea behind a generative model is to build a model capable of producing samples from the true data distribution. In practical applications, where the data has a complex structure, such as images, approximating the true data distribution is a challenging task. The problem of training generative models is well investigated in the literature. Typically, a deep generative model draws a random sample from an assumed prior distribution and transforms it with a deep neural network into a sample from the data distribution. Various generative models have been proposed in the literature. The most popular are Generative Adversarial Networks (GANs), known for their ability to generate good-looking images. The main drawback of this group of models is the lack of likelihood estimation techniques. Another group of generative models are Variational Autoencoders (VAEs). VAEs optimize the log-likelihood of the data by approximating the true posterior with an inference distribution represented by an additional network. As a consequence, instead of maximizing the log-likelihood directly, one optimizes the Evidence Lower Bound (ELBO), which is a lower bound on the true likelihood. Exact likelihood estimation can be achieved with so-called flow-based models. The central idea of this group of models is to apply the change-of-variables formula and map a simple prior distribution to a complex data distribution using flow-based transformations. This approach assumes a set of transformations that are invertible and whose Jacobian determinants can be computed efficiently. In the project, we would like to design, implement, and validate such deep generative models.
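For readers unfamiliar with the change-of-variables mechanics, the idea can be sketched in a few lines of plain NumPy. This is a toy single affine coupling layer (in the style of RealNVP); the conditioner weights `w_s` and `w_t` stand in for a neural network, and nothing here describes the project's actual architecture:

```python
import numpy as np

# Toy affine coupling layer: the first coordinate passes through unchanged
# and parameterizes a scale/shift of the second, so the map is invertible
# and its Jacobian is triangular (cheap determinant).

def coupling_forward(x, w_s, w_t):
    """Map data x to latent z; return z and log|det J|."""
    x1, x2 = x[:, :1], x[:, 1:]
    s = np.tanh(x1 @ w_s)            # log-scale, a function of x1 only
    t = x1 @ w_t                     # shift, a function of x1 only
    z = np.concatenate([x1, x2 * np.exp(s) + t], axis=1)
    return z, s.sum(axis=1)          # log|det J| = sum of log-scales

def coupling_inverse(z, w_s, w_t):
    """Exact inverse: z1 is untouched, so s and t can be recomputed."""
    z1, z2 = z[:, :1], z[:, 1:]
    s = np.tanh(z1 @ w_s)
    t = z1 @ w_t
    return np.concatenate([z1, (z2 - t) * np.exp(-s)], axis=1)

def log_likelihood(x, w_s, w_t):
    """log p(x) = log N(z; 0, I) + log|det J| (standard-normal prior)."""
    z, log_det = coupling_forward(x, w_s, w_t)
    log_prior = -0.5 * (z ** 2).sum(axis=1) - 0.5 * z.shape[1] * np.log(2 * np.pi)
    return log_prior + log_det

rng = np.random.default_rng(0)
w_s, w_t = rng.normal(size=(1, 1)), rng.normal(size=(1, 1))
x = rng.normal(size=(4, 2))
z, _ = coupling_forward(x, w_s, w_t)
print(np.allclose(coupling_inverse(z, w_s, w_t), x))  # → True
```

Stacking many such layers (with the roles of the coordinates permuted between layers) yields an expressive yet exactly invertible transformation, which is what makes exact likelihood evaluation possible.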
The primary goal of the proposed generative models is to create 3D representations of objects unseen in the training data. We will focus on the following aspects of generative modeling that have not been well explored in the literature:

  • Developing effective flow-based models for 3D representations. Current flow-based models are ineffective due to the huge number of parameters that must be trained in the coupling layers, a very long training process, and the unsatisfactory quality of the generated samples. On the other hand, models like PointFlow, which utilize continuous normalizing flows, are capable of learning complex shape representations, but the training procedure is very time-consuming and the capacity of the flow architecture is limited by the need to apply ODE solvers during training. In the project, we would like to address these limitations and propose a novel generative model that uses so-called hypernetworks to predict the parameters of the flow. As a consequence, our approach will yield a simpler flow architecture, keeping the high quality of the generated data while significantly reducing the training time and memory footprint.
  • Designing new metrics for evaluation. The quantitative analysis of generative models is a challenging task. Therefore, we are going to provide a novel framework for evaluating the quality of the generated samples. The proposed quality measure will be independent of the distance measure and will make use of the likelihood of the generated data under the designed generative model.
  • Developing a novel generative model with a compact binary latent representation that satisfies the following requirements: it encodes 3D data examples in a compact binary representation, reconstructs objects from the compact embedding, and generates new samples from a multivariate Bernoulli distribution with the support of additional Gaussian noise.
  • Proposing methods for incorporating additional partial evidence into the generative model without retraining the model architecture.
  • Exploring representations of 3D data other than point clouds that focus more on the surface of 3D objects.
  • Developing a model that allows reconstructing the 3D shape of an object from a 2D image.
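As a rough illustration of the hypernetwork idea from the first objective, the sketch below predicts the parameters of a tiny per-shape flow from a shape embedding. All names, sizes, and the trivial per-axis flow are hypothetical stand-ins, not the project's actual design:

```python
import numpy as np

# Hypothetical sketch: a hypernetwork maps a per-shape embedding to the
# parameters of a small flow, so the flow itself stays lightweight and
# no ODE solver is needed at training time.

def hypernetwork(embedding, H):
    """Predict the flat parameter vector of the target flow."""
    return 0.1 * np.tanh(embedding @ H)

def target_flow(points, params, dim=3):
    """A toy invertible map: per-axis log-scale and shift of the point cloud."""
    s, t = params[:dim], params[dim:2 * dim]
    z = points * np.exp(s) + t                 # invertible, diagonal Jacobian
    log_det = np.full(len(points), s.sum())    # log|det J| per point
    return z, log_det

rng = np.random.default_rng(1)
embedding = rng.normal(size=16)                # per-shape latent code
H = rng.normal(size=(16, 6))                   # hypernetwork weights (3 scales + 3 shifts)
points = rng.normal(size=(100, 3))             # a toy point cloud
z, log_det = target_flow(points, hypernetwork(embedding, H))
print(z.shape, log_det.shape)                  # → (100, 3) (100,)
```

In a realistic variant the target flow would be a stack of conditional coupling layers and the embedding would come from a point-cloud encoder; the key point is only that the flow's weights are produced by the hypernetwork rather than trained directly.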