 Episodes
Chamin Hewa Koneputugodage - DiGS

In this episode of the Talking Papers Podcast, I hosted Chamin Hewa Koneputugodage to chat about OUR paper "DiGS: Divergence guided shape implicit neural representation for unoriented point clouds", published in CVPR 2022. In this paper, we took on the task of surface reconstruction using a novel divergence-guided approach. Unlike previous methods, we do not use normal vectors for supervision. To compensate, we add a divergence minimization loss as a regularizer to obtain a coarse shape and then anneal it as training progresses to recover finer detail. Additionally, we propose two new geometric initializations for SIREN-based networks that enable learning shape spaces.

PAPER TITLE
"DiGS: Divergence guided shape implicit neural representation for unoriented point clouds"

AUTHORS
Yizhak Ben-Shabat, Chamin Hewa Koneputugodage, Stephen Gould

ABSTRACT
Shape implicit neural representations (INR) have recently shown to be effective in shape analysis and reconstruction tasks. Existing INRs require point coordinates to learn the implicit level sets of the shape. When a normal vector is available for each point, a higher fidelity representation can be learned; however, normal vectors are often not provided as raw data. Furthermore, the method's initialization has been shown to play a crucial role for surface reconstruction. In this paper, we propose a divergence guided shape representation learning approach that does not require normal vectors as input. We show that incorporating a soft constraint on the divergence of the distance function favours smooth solutions that reliably orient gradients to match the unknown normal at each point, in some cases even better than approaches that use ground truth normal vectors directly. Additionally, we introduce a novel geometric initialization method for sinusoidal INRs that further improves convergence to the desired solution. We evaluate the effectiveness of our approach on the task of surface reconstruction and shape space learning and show SOTA performance compared to other unoriented methods.

RELATED PAPERS
📚 DeepSDF
📚 SIREN

LINKS AND RESOURCES
💻 Project Page
💻 Code
🎥 5 min video

To stay up to date with Chamin's latest research, follow him on:
🐦 Twitter
👨🏻‍🎓 LinkedIn

Recorded on April 1st 2022.

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app
📧 Subscribe to our mailing list
🐦 Follow us on Twitter
🎥 YouTube Channel

#talkingpapers #CVPR2022 #DiGS #NeuralImplicitRepresentation #SurfaceReconstruction #ShapeSpace #3DVision #ComputerVision #AI #DeepLearning #MachineLearning #neuralnetworks #research #artificialintelligence
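For listeners who want a concrete feel for the divergence term, here is a minimal PyTorch sketch of the idea (my own simplification, not the released DiGS code): a plain MLP stands in for the SIREN network and its geometric initialization, and the divergence of the predicted gradient field is penalized with a weight that is annealed over training, alongside the usual eikonal and on-surface terms.

```python
import torch

def gradient(y, x):
    # spatial gradient of a scalar field y = f(x), kept in the graph for higher-order terms
    return torch.autograd.grad(y, x, grad_outputs=torch.ones_like(y), create_graph=True)[0]

def divergence_loss(f, x):
    x.requires_grad_(True)
    g = gradient(f(x), x)                                   # (N, 3) gradient field
    div = sum(gradient(g[:, i:i + 1], x)[:, i] for i in range(x.shape[1]))
    return div.abs().mean()

def eikonal_loss(f, x):
    x.requires_grad_(True)
    g = gradient(f(x), x)
    return ((g.norm(dim=-1) - 1.0) ** 2).mean()

# toy network and annealing schedule (placeholder values, not the paper's)
f = torch.nn.Sequential(torch.nn.Linear(3, 128), torch.nn.Softplus(beta=100),
                        torch.nn.Linear(128, 1))
opt = torch.optim.Adam(f.parameters(), lr=1e-4)
surface_pts = torch.rand(1024, 3)                           # stand-in for the input point cloud
for step in range(1000):
    space_pts = torch.rand(1024, 3) * 2 - 1                 # points sampled in the volume
    w_div = max(0.0, 1.0 - step / 500)                      # anneal the divergence weight to zero
    loss = (f(surface_pts).abs().mean()                     # surface points lie on the zero level set
            + eikonal_loss(f, space_pts.clone())
            + w_div * divergence_loss(f, space_pts.clone()))
    opt.zero_grad(); loss.backward(); opt.step()
```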
Dejan Azinović - Neural RGB-D Surface Reconstruction

In this episode of the Talking Papers Podcast, I hosted Dejan Azinović to chat about his paper "Neural RGB-D Surface Reconstruction", published in CVPR 2022. In this paper, they take on the task of RGB-D surface reconstruction by using novel view synthesis. They incorporate depth measurements into the radiance field formulation by learning a neural network that stores a truncated signed distance field. This formulation is particularly useful in regions where depth is missing and the color information can help fill in the gaps.

PAPER TITLE
"Neural RGB-D Surface Reconstruction"

AUTHORS
Dejan Azinović, Ricardo Martin-Brualla, Dan B Goldman, Matthias Nießner, Justus Thies

ABSTRACT
In this work, we explore how to leverage the success of implicit novel view synthesis methods for surface reconstruction. Methods which learn a neural radiance field have shown amazing image synthesis results, but the underlying geometry representation is only a coarse approximation of the real geometry. We demonstrate how depth measurements can be incorporated into the radiance field formulation to produce more detailed and complete reconstruction results than using methods based on either color or depth data alone. In contrast to a density field as the underlying geometry representation, we propose to learn a deep neural network which stores a truncated signed distance field. Using this representation, we show that one can still leverage differentiable volume rendering to estimate color values of the observed images during training to compute a reconstruction loss. This is beneficial for learning the signed distance field in regions with missing depth measurements. Furthermore, we correct for misalignment errors of the camera, improving the overall reconstruction quality. In several experiments, we showcase our method and compare to existing works on classical RGB-D fusion and learned representations.

RELATED PAPERS
📚 NeRF
📚 BundleFusion

LINKS AND RESOURCES
💻 Project Page
💻 Code

To stay up to date with Dejan's latest research, follow him on:
👨🏻‍🎓 Dejan's personal page
🎓 Google Scholar
🐦 Twitter
👨🏻‍🎓 LinkedIn

Recorded on April 4th 2022.

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app
📧 Subscribe to our mailing list
🐦 Follow us on Twitter
🎥 YouTube Channel

#talkingpapers #CVPR2022 #NeuralRGBDSurfaceReconstruction #SurfaceReconstruction #NeRF #3DVision #ComputerVision #AI #DeepLearning #MachineLearning #neuralnetworks #research #artificialintelligence
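To give a feel for how a truncated signed distance field can drive volume rendering, here is a small sketch (assumptions and simplifications are mine, it is not the paper's exact weight function): SDF samples along a ray are converted into rendering weights that peak at the zero crossing, and colour and depth are composited from them.

```python
import torch

def sdf_to_weights(sdf, truncation=0.05):
    # bell-shaped weight peaking at the surface crossing (a common simplification)
    w = torch.sigmoid(sdf / truncation) * torch.sigmoid(-sdf / truncation)
    return w / (w.sum(dim=-1, keepdim=True) + 1e-8)

def render_ray(field, origin, direction, n_samples=64, near=0.1, far=4.0):
    t = torch.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction                 # (n_samples, 3) samples along the ray
    sdf, rgb = field(pts)                                  # (n,), (n, 3)
    w = sdf_to_weights(sdf)
    color = (w[:, None] * rgb).sum(dim=0)                  # composited colour
    depth = (w * t).sum(dim=0)                             # expected depth along the ray
    return color, depth

# toy field just to make the sketch runnable: unit-sphere SDF with constant red colour
def toy_field(p):
    return p.norm(dim=-1) - 1.0, torch.tensor([1.0, 0.0, 0.0]).expand(p.shape[0], 3)

c, d = render_ray(toy_field, torch.tensor([0.0, 0.0, -3.0]), torch.tensor([0.0, 0.0, 1.0]))
```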
Yuliang Xiu - ICON

2022-04-19 36:01

In this episode of the Talking Papers Podcast, I hosted Yuliang Xiu to chat about his paper "ICON: Implicit Clothed humans Obtained from Normals", published in CVPR 2022. In this paper, they exploit the SMPL(-X) body model to infer clothed humans (conditioned on the normals). Additionally, they propose an inference-time feedback loop that alternates between refining the body's normals and the shape.

PAPER TITLE
"ICON: Implicit Clothed humans Obtained from Normals" https://bit.ly/3uXe6Yw

AUTHORS
Yuliang Xiu, Jinlong Yang, Dimitrios Tzionas, Michael J. Black

ABSTRACT
Current methods for learning realistic and animatable 3D clothed avatars need either posed 3D scans or 2D images with carefully controlled user poses. In contrast, our goal is to learn an avatar from only 2D images of people in unconstrained poses. Given a set of images, our method estimates a detailed 3D surface from each image and then combines these into an animatable avatar. Implicit functions are well suited to the first task, as they can capture details like hair and clothes. Current methods, however, are not robust to varied human poses and often produce 3D surfaces with broken or disembodied limbs, missing details, or non-human shapes. The problem is that these methods use global feature encoders that are sensitive to global pose. To address this, we propose ICON ("Implicit Clothed humans Obtained from Normals"), which, instead, uses local features. ICON has two main modules, both of which exploit the SMPL(-X) body model. First, ICON infers detailed clothed-human normals (front/back) conditioned on the SMPL(-X) normals. Second, a visibility-aware implicit surface regressor produces an iso-surface of a human occupancy field. Importantly, at inference time, a feedback loop alternates between refining the SMPL(-X) mesh using the inferred clothed normals and then refining the normals. Given multiple reconstructed frames of a subject in varied poses, we use SCANimate to produce an animatable avatar from them. Evaluation on the AGORA and CAPE datasets shows that ICON outperforms the state of the art in reconstruction, even with heavily limited training data. Additionally, it is much more robust to out-of-distribution samples, e.g., in-the-wild poses/images and out-of-frame cropping. ICON takes a step towards robust 3D clothed human reconstruction from in-the-wild images. This enables creating avatars directly from video with personalized and natural pose-dependent cloth deformation.

RELATED PAPERS
📚 Monocular Real-Time Volumetric Performance Capture https://bit.ly/3L2S4JF
📚 PIFu https://bit.ly/3rBsrYN
📚 PIFuHD https://bit.ly/3rymDiE

LINKS AND RESOURCES
💻 Project Page https://icon.is.tue.mpg.de/
💻 Code https://github.com/yuliangxiu/ICON

To stay up to date with Yuliang's latest research, follow him on:
👨🏻‍🎓 Yuliang's personal page: https://bit.ly/3jQb16n
🎓 Google Scholar: https://bit.ly/3JW25ae
🐦 Twitter: https://twitter.com/yuliangxiu
👨🏻‍🎓 LinkedIn: https://www.linkedin.com/in/yuliangxiu/

Recorded on March 11th 2022.

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikb...
📧 Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦 Follow us on Twitter: https://twitter.com/talking_papers
🎥 YouTube Channel: https://bit.ly/3eQOgwP

#talkingpapers #CVPR2022 #ICON #ImplicitHumans #3DVision #ComputerVision #AI #DeepLearning #MachineLearning #neuralnetworks #research #artificialintelligence
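To make the inference-time feedback loop a bit more tangible, here is a schematic PyTorch sketch. Only the loop structure follows the paper; the tiny networks, the normal rasteriser and the SMPL(-X) update below are toy stand-ins I made up so the snippet runs on its own.

```python
import torch
import torch.nn as nn

# toy stand-ins (not the paper's modules)
normal_net = nn.Conv2d(6, 3, kernel_size=3, padding=1)   # (image + body normals) -> clothed normals
implicit_head = nn.Linear(7, 1)                           # local per-query features -> occupancy logit

def render_body_normals(smpl_params, H=64, W=64):
    # toy stand-in for rasterising SMPL(-X) normal maps from the current body estimate
    return torch.tanh(smpl_params.mean()) * torch.ones(1, 3, H, W)

image = torch.rand(1, 3, 64, 64)
smpl_params = torch.zeros(10)                              # toy pose/shape vector
for _ in range(3):                                         # inference-time feedback loop
    body_normals = render_body_normals(smpl_params)
    clothed_normals = normal_net(torch.cat([image, body_normals], dim=1))
    # "refine" the body so its rendered normals better agree with the predicted ones (toy update)
    smpl_params = smpl_params + 0.1 * (clothed_normals.mean() - body_normals.mean()).detach()
# final implicit regression from local features (e.g. distance to body, body normal, clothed normal)
occupancy_logits = implicit_head(torch.rand(1024, 7))
```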
Itai Lang - SampleNet

2022-03-28 37:51

In this episode of the Talking Papers Podcast, I hosted Itai Lang to chat about his paper "SampleNet: Differentiable Point Cloud Sampling", published in CVPR 2020. In this paper, they propose a point soft-projection to allow differentiating through the sampling operation and enable learning task-specific point sampling. Combined with their regularization and task-specific losses, they can reduce the number of points to 3% of the original samples with a very low impact on task performance. I met Itai for the first time at CVPR 2019. Being a point-cloud guy myself, I have been following his research work ever since. It is amazing how much progress he has made and I can't wait to see what he comes up with next. It was a pleasure hosting him on the podcast.

PAPER TITLE
"SampleNet: Differentiable Point Cloud Sampling" https://bit.ly/3wMFwll

AUTHORS
Itai Lang, Asaf Manor, Shai Avidan

ABSTRACT
…and offered a workaround instead. We introduce a novel differentiable relaxation for point cloud sampling that approximates sampled points as a mixture of points in the primary input cloud. Our approximation scheme leads to consistently good results on classification and geometry reconstruction applications. We also show that the proposed sampling method can be used as a front to a point cloud registration network. This is a challenging task since sampling must be consistent across two different point clouds for a shared downstream task. In all cases, our approach outperforms existing non-learned and learned sampling alternatives. Our code is publicly available.

RELATED PAPERS
📚 Learning to Sample https://bit.ly/3vd1FZd
📚 Farthest Point Sampling (FPS) https://bit.ly/3Lkcyx9

LINKS AND RESOURCES
💻 Code https://bit.ly/3NoS0pb

To stay up to date with Itai's latest research, follow him on:
🎓 Google Scholar: https://bit.ly/3wCMY2u
🐦 Twitter: https://twitter.com/ItaiLang

Recorded on February 15th 2022.

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikb...
📧 Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦 Follow us on Twitter: https://twitter.com/talking_papers
🎥 YouTube Channel: https://bit.ly/3eQOgwP

#talkingpapers #SampleNet #LearnToSample #CVPR2020 #3DVision #ComputerVision #AI #DeepLearning #MachineLearning #neuralnetworks #research #artificialintelligence
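As a rough illustration of the soft-projection idea (a simplification I wrote, not the official SampleNet code): each generated point is replaced by a temperature-weighted convex combination of its k nearest neighbours in the input cloud, so the operation stays differentiable and approaches hard nearest-neighbour selection as the temperature shrinks.

```python
import torch

def soft_project(generated, cloud, temperature, k=8):
    # generated: (M, 3) points proposed by the network, cloud: (N, 3) input points
    d2 = torch.cdist(generated, cloud) ** 2                        # (M, N) squared distances
    nn_d2, nn_idx = d2.topk(k, dim=1, largest=False)               # k nearest neighbours per point
    weights = torch.softmax(-nn_d2 / (temperature ** 2), dim=1)    # (M, k) soft assignment
    neighbours = cloud[nn_idx]                                      # (M, k, 3)
    return (weights.unsqueeze(-1) * neighbours).sum(dim=1)          # projected points stay near the cloud

cloud = torch.rand(1024, 3)
generated = torch.rand(32, 3, requires_grad=True)
t = torch.nn.Parameter(torch.tensor(1.0))                           # learnable temperature
projected = soft_project(generated, cloud, t)
projected.sum().backward()                                          # gradients flow to points and temperature
```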
Manuel Dahnert - Panoptic 3D Scene Reconstruction

In this episode of the Talking Papers Podcast, I hosted Manuel Dahnert to chat about his paper "Panoptic 3D Scene Reconstruction From a Single RGB Image", published in NeurIPS 2021. In this paper, they unify the tasks of reconstruction, semantic segmentation and instance segmentation in 3D from a single RGB image. They propose a holistic approach to lift the 2D features into a 3D grid. Manuel is a good friend and colleague. We first met during my research visit at TUM in my PhD, where we spent some long evenings together at the office. We have both come a long way since then and I am really looking forward to seeing what he will cook up next. I have a feeling it is not his last visit to the podcast.

PAPER TITLE
"Panoptic 3D Scene Reconstruction From a Single RGB Image": https://bit.ly/3phnLGp

AUTHORS
Manuel Dahnert, Ji Hou, Matthias Niessner, Angela Dai

ABSTRACT
Richly segmented 3D scene reconstructions are an integral basis for many high-level scene understanding tasks, such as for robotics, motion planning, or augmented reality. Existing works in 3D perception from a single RGB image tend to focus on geometric reconstruction only, or geometric reconstruction with semantic segmentation or instance segmentation. Inspired by 2D panoptic segmentation, we propose to unify the tasks of geometric reconstruction, 3D semantic segmentation, and 3D instance segmentation into the task of panoptic 3D scene reconstruction -- from a single RGB image, predicting the complete geometric reconstruction of the scene in the camera frustum of the image, along with semantic and instance segmentations. We propose a new approach for holistic 3D scene understanding from a single RGB image which learns to lift and propagate 2D features from an input image to a 3D volumetric scene representation. Our panoptic 3D reconstruction metric evaluates both geometric reconstruction quality as well as panoptic segmentation. Our experiments demonstrate that our approach for panoptic 3D scene reconstruction outperforms alternative approaches for this task.

RELATED PAPERS
📚 Panoptic Segmentation: https://bit.ly/3vd1FZd
📚 MeshCNN: https://bit.ly/3M2lWH6
📚 Total3DUnderstanding: https://bit.ly/36yH9bf

LINKS AND RESOURCES
💻 Project Page: https://bit.ly/3JT2Dy1
💻 CODE: https://github.com/xheon/panoptic-reconstruction
🤐 Paper's peer review: https://bit.ly/3Cij44t

To stay up to date with Manuel's latest research, check out his personal page and follow him on:
👨‍🎓 Google Scholar: https://scholar.google.com/citations?user=eNypkO0AAAAJ
🐦 Twitter: https://twitter.com/manuel_dahnert

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikb...
📧 Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦 Follow us on Twitter: https://twitter.com/talking_papers
🎥 YouTube Channel: https://bit.ly/3eQOgwP
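For a rough picture of the "lift 2D features to 3D" step, here is a toy sketch (my own illustrative pinhole-camera version, not the paper's implementation): every voxel centre in the camera frustum is projected into the image and picks up the 2D feature at that pixel via bilinear sampling, ready to be scattered into a volumetric grid.

```python
import torch
import torch.nn.functional as F

def lift_features(feat2d, voxel_centers, K, H, W):
    # feat2d: (1, C, H, W) image features, voxel_centers: (V, 3) in camera coordinates (z > 0)
    uvw = (K @ voxel_centers.T).T                    # (V, 3) homogeneous pixel coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]                    # perspective divide
    grid = torch.stack([uv[:, 0] / (W - 1) * 2 - 1,  # normalise to [-1, 1] for grid_sample
                        uv[:, 1] / (H - 1) * 2 - 1], dim=-1).view(1, 1, -1, 2)
    sampled = F.grid_sample(feat2d, grid, align_corners=True)   # (1, C, 1, V)
    return sampled.squeeze(0).squeeze(1).T           # (V, C) per-voxel features

C, H, W = 16, 120, 160
K = torch.tensor([[100.0, 0.0, W / 2], [0.0, 100.0, H / 2], [0.0, 0.0, 1.0]])
feat2d = torch.rand(1, C, H, W)
voxels = torch.rand(4096, 3) * torch.tensor([2.0, 2.0, 3.0]) + torch.tensor([-1.0, -1.0, 0.5])
per_voxel = lift_features(feat2d, voxels, K, H, W)
```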
Songyou Peng - Shape As Points

In this episode of the Talking Papers Podcast, I hosted Songyou Peng to chat about his paper "Shape As Points: A Differentiable Poisson Solver", published in NeurIPS 2021. In this paper, they take on the task of surface reconstruction and propose a hybrid representation that unifies explicit and implicit representations, in addition to a differentiable solver for the classic Poisson surface reconstruction. I have been following Songyou's work for a while and was very surprised to discover that he is just about midway through his PhD (with so many good papers, I thought he was about to finish!). We first met online at the ICCV 2021 workshop on "Learning 3D Representations for Shape and Appearance" and I immediately flagged him as one of the next guests on the podcast. It was a pleasure recording this episode with him.

AUTHORS
Songyou Peng, Chiyu Jiang, Yiyi Liao, Michael Niemeyer, Marc Pollefeys, Andreas Geiger

ABSTRACT
In recent years, neural implicit representations gained popularity in 3D reconstruction due to their expressiveness and flexibility. However, the implicit nature of neural implicit representations results in slow inference time and requires careful initialization. In this paper, we revisit the classic yet ubiquitous point cloud representation and introduce a differentiable point-to-mesh layer using a differentiable formulation of Poisson Surface Reconstruction (PSR) that allows for a GPU-accelerated fast solution of the indicator function given an oriented point cloud. The differentiable PSR layer allows us to efficiently and differentiably bridge the explicit 3D point representation with the 3D mesh via the implicit indicator field, enabling end-to-end optimization of surface reconstruction metrics such as Chamfer distance. This duality between points and meshes hence allows us to represent shapes as oriented point clouds, which are explicit, lightweight and expressive. Compared to neural implicit representations, our Shape-As-Points (SAP) model is more interpretable, lightweight, and accelerates inference time by one order of magnitude. Compared to other explicit representations such as points, patches, and meshes, SAP produces topology-agnostic, watertight manifold surfaces. We demonstrate the effectiveness of SAP on the task of surface reconstruction from unoriented point clouds and learning-based reconstruction.

RELATED PAPERS
📚 Poisson Surface Reconstruction
📚 Occupancy Networks
📚 Convolutional Occupancy Networks

LINKS AND RESOURCES
💻 Project Page: https://pengsongyou.github.io/sap
💻 CODE: https://github.com/autonomousvision/shape_as_points
📚 Paper
🤐 Paper's peer review

To stay up to date with Songyou's latest research, check out his personal page and follow him on:
👨‍🎓 Google Scholar
🐦 Twitter
👨‍🎓 LinkedIn

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

SUBSCRIBE AND FOLLOW
📧 Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦 Follow us on Twitter: https://twitter.com/talking_papers
🎥 YouTube Channel: https://bit.ly/3eQOgwP

This episode was recorded on January 22 2022.
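To illustrate what a differentiable Poisson solve can look like, here is a compact spectral sketch (a simplification along the lines of the paper's PSR layer, not the released implementation, and without the smoothing and scaling the paper uses): given a normal vector field rasterised onto a regular grid, solve the Poisson equation for the indicator function in the Fourier domain. Every step is differentiable, so gradients can flow back to point positions and normals.

```python
import math
import torch

def spectral_poisson(V):
    # V: (3, R, R, R) grid of point normals; components and grid axes are both ordered (x, y, z)
    R = V.shape[-1]
    freqs = torch.fft.fftfreq(R)
    fx, fy, fz = torch.meshgrid(freqs, freqs, freqs, indexing="ij")
    k = 2 * math.pi * torch.stack([fx, fy, fz])           # angular frequencies, (3, R, R, R)
    ik = torch.complex(torch.zeros_like(k), k)
    V_hat = torch.fft.fftn(V, dim=(1, 2, 3))
    div_hat = (ik * V_hat).sum(dim=0)                      # Fourier transform of div(V)
    k2 = (k ** 2).sum(dim=0)
    k2[0, 0, 0] = 1.0                                       # avoid dividing by zero at the DC term
    chi_hat = -div_hat / k2                                 # solve laplacian(chi) = div(V)
    chi = torch.fft.ifftn(chi_hat).real
    return chi - chi.mean()                                 # zero-mean indicator, surface at level 0

V = torch.zeros(3, 32, 32, 32)
V[2, 16, 16, 10] = 1.0                                      # a single "normal" pointing along z
chi = spectral_poisson(V)
```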
Yicong Hong - VLN BERT

2022-02-17 22:57

PAPER TITLE
"VLN BERT: A Recurrent Vision-and-Language BERT for Navigation"

AUTHORS
Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-Opazo, Stephen Gould

ABSTRACT
Accuracy of many visiolinguistic tasks has benefited significantly from the application of vision-and-language (V&L) BERT. However, its application for the task of vision-and-language navigation (VLN) remains limited. One reason for this is the difficulty of adapting the BERT architecture to the partially observable Markov decision process present in VLN, requiring history-dependent attention and decision making. In this paper, we propose a recurrent BERT model that is time-aware for use in VLN. Specifically, we equip the BERT model with a recurrent function that maintains cross-modal state information for the agent. Through extensive experiments on R2R and REVERIE we demonstrate that our model can replace more complex encoder-decoder models to achieve state-of-the-art results. Moreover, our approach can be generalised to other transformer-based architectures, supports pre-training, and is capable of solving navigation and referring expression tasks simultaneously.

CODE
💻 https://github.com/YicongHong/Recurrent-VLN-BERT

LINKS AND RESOURCES
👱 Yicong's page

RELATED PAPERS
📚 Attention is All You Need
📚 Towards learning a generic agent for vision-and-language navigation via pre-training

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

This episode was recorded on April 16th 2021.

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧 Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦 Follow us on Twitter: https://twitter.com/talking_papers
🎥 YouTube Channel: https://bit.ly/3eQOgwP

#talkingpapers #CVPR2021 #VLNBERT #VLN #VisionAndLanguageNavigation #VisionAndLanguage #machinelearning #deeplearning #AI #neuralnetworks #research #computervision #artificialintelligence
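For a rough intuition of the "recurrent state" idea, here is a toy PyTorch sketch (my own minimal version, not the released model): a single state vector is concatenated with the language and per-step visual tokens, refined by self-attention, and carried over to the next navigation step so history can influence the current decision.

```python
import torch
import torch.nn as nn

class RecurrentStateStep(nn.Module):
    def __init__(self, dim=256, heads=4, n_actions=6):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.action_head = nn.Linear(dim, n_actions)

    def forward(self, state, lang_tokens, vis_tokens):
        # state: (B, 1, D), lang_tokens: (B, L, D), vis_tokens: (B, V, D)
        tokens = torch.cat([state, lang_tokens, vis_tokens], dim=1)
        tokens, _ = self.attn(tokens, tokens, tokens)       # cross-modal self-attention
        new_state = tokens[:, :1]                            # the updated state token
        return new_state, self.action_head(new_state.squeeze(1))

step = RecurrentStateStep()
state = torch.zeros(2, 1, 256)       # zeros here; the paper initialises the state from the instruction
lang = torch.rand(2, 12, 256)        # instruction tokens
for t in range(5):                   # unroll a 5-step navigation episode
    vis = torch.rand(2, 8, 256)      # candidate-view features observed at step t
    state, action_logits = step(state, lang, vis)
```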
PAPER TITLE
"Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks"

AUTHORS
Despoina Paschalidou, Angelos Katharopoulos, Andreas Geiger, Sanja Fidler

ABSTRACT
Impressive progress in 3D shape extraction led to representations that can capture object geometries with high fidelity. In parallel, primitive-based methods seek to represent objects as semantically consistent part arrangements. However, due to the simplicity of existing primitive representations, these methods fail to accurately reconstruct 3D shapes using a small number of primitives/parts. We address the trade-off between reconstruction quality and number of parts with Neural Parts, a novel 3D primitive representation that defines primitives using an Invertible Neural Network (INN) which implements homeomorphic mappings between a sphere and the target object. The INN allows us to compute the inverse mapping of the homeomorphism, which in turn, enables the efficient computation of both the implicit surface function of a primitive and its mesh, without any additional post-processing. Our model learns to parse 3D objects into semantically consistent part arrangements without any part-level supervision. Evaluations on ShapeNet, D-FAUST and FreiHAND demonstrate that our primitives can capture complex geometries and thus simultaneously achieve geometrically accurate as well as interpretable reconstructions using an order of magnitude fewer primitives than state-of-the-art shape abstraction methods.

RELATED PAPERS
📚 "KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control"
📚 "Learning Shape Abstractions by Assembling Volumetric Primitives"
📚 "Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids"
📚 "CvxNet: Learnable Convex Decomposition"
📚 "Neural Star Domain as Primitive Representation"

LINKS AND RESOURCES
💻 Project Page: https://paschalidoud.github.io/neural_parts
💻 CODE: https://github.com/paschalidoud/neural_parts
💻 Blog Post

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧 Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦 Follow us on Twitter: https://twitter.com/talking_papers
🎥 YouTube Channel: https://bit.ly/3eQOgwP

This episode was recorded on April 25th 2021.
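To see why invertibility buys you both a mesh and an implicit function, here is a self-contained sketch using a generic affine-coupling invertible network of my own choosing (not the authors' architecture): the forward map deforms points on a unit sphere into a primitive's surface, while the exact inverse lets us evaluate an inside/outside function for any query point by measuring the norm of its pre-image.

```python
import torch
import torch.nn as nn

class Coupling(nn.Module):
    """One affine coupling layer: invertible in closed form."""
    def __init__(self, frozen_idx, dim=3, hidden=64):
        super().__init__()
        mask = torch.zeros(dim)
        mask[frozen_idx] = 1.0               # this coordinate conditions, the others are moved
        self.register_buffer("mask", mask)
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 2 * dim))

    def forward(self, x):
        frozen = x * self.mask
        s, t = self.net(frozen).chunk(2, dim=-1)
        s = torch.tanh(s) * (1 - self.mask)
        t = t * (1 - self.mask)
        return frozen + (1 - self.mask) * (x * torch.exp(s) + t)

    def inverse(self, y):
        frozen = y * self.mask
        s, t = self.net(frozen).chunk(2, dim=-1)
        s = torch.tanh(s) * (1 - self.mask)
        t = t * (1 - self.mask)
        return frozen + (1 - self.mask) * ((y - t) * torch.exp(-s))

layers = nn.ModuleList([Coupling(i % 3) for i in range(6)])

def to_surface(sphere_pts):                   # explicit side: push sphere samples through the INN
    for layer in layers:
        sphere_pts = layer(sphere_pts)
    return sphere_pts

def implicit(query_pts):                       # implicit side: pull queries back to the sphere
    for layer in reversed(layers):
        query_pts = layer.inverse(query_pts)
    return query_pts.norm(dim=-1) - 1.0        # < 0 inside the primitive, > 0 outside

sphere = torch.nn.functional.normalize(torch.randn(512, 3), dim=-1)
surface_pts = to_surface(sphere)               # primitive surface samples, no post-processing
values = implicit(torch.rand(100, 3))          # inside/outside values for arbitrary queries
```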
Guy Gafni - NerFACE

2022-02-03 32:44

PAPER TITLE
"Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction"

AUTHORS
Guy Gafni, Justus Thies, Michael Zollhöfer, Matthias Nießner

Project page: https://gafniguy.github.io/4D-Facial-Avatars/

CODE
💻 https://github.com/gafniguy/4D-Facial-Avatars

ABSTRACT
We present dynamic neural radiance fields for modeling the appearance and dynamics of a human face. Digitally modeling and reconstructing a talking human is a key building-block for a variety of applications. Especially, for telepresence applications in AR or VR, a faithful reproduction of the appearance including novel viewpoint or head-poses is required. In contrast to state-of-the-art approaches that model the geometry and material properties explicitly, or are purely image-based, we introduce an implicit representation of the head based on scene representation networks. To handle the dynamics of the face, we combine our scene representation network with a low-dimensional morphable model which provides explicit control over pose and expressions. We use volumetric rendering to generate images from this hybrid representation and demonstrate that such a dynamic neural scene representation can be learned from monocular input data only, without the need of a specialized capture setup. In our experiments, we show that this learned volumetric representation allows for photo-realistic image generation that surpasses the quality of state-of-the-art video-based reenactment methods.

RELATED PAPERS
📚 Representing Scenes as Neural Radiance Fields for View Synthesis
📚 Deep Video Portraits
📚 Nerfies: Deformable Neural Radiance Fields
📚 AD-NeRF: Audio Driven Neural Radiance Fields for Talking Head Synthesis

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

TIME STAMPS
00:00
00:07 Intro
00:27 Authors
01:16 Abstract / TLDR
02:54 Motivation
12:24 Related Work
13:20 Approach
17:10 Results
27:05 Conclusions and future work
32:12 Outro

#talkingpapers #CVPR2021 #NeRF #machinelearning #deeplearning #AI #neuralnetworks #research #computervision #artificialintelligence #FacialAvatars

Recorded on April 2nd 2021.

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧 Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦 Follow us on Twitter: https://twitter.com/talking_papers
🎥 YouTube Channel: https://bit.ly/3eQOgwP
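As a miniature illustration of the conditioning idea (my own simplified network, not the paper's), the radiance-field MLP below receives, besides the positionally encoded 3D sample, a low-dimensional morphable-model expression vector, so pose and expression can be controlled explicitly at render time. The 76-dimensional code and layer sizes are placeholder choices.

```python
import torch
import torch.nn as nn

def posenc(x, n_freqs=6):
    # standard sinusoidal positional encoding of 3D points
    feats = [x]
    for i in range(n_freqs):
        feats += [torch.sin((2 ** i) * x), torch.cos((2 ** i) * x)]
    return torch.cat(feats, dim=-1)

class DynamicRadianceField(nn.Module):
    def __init__(self, expr_dim=76, hidden=128, n_freqs=6):
        super().__init__()
        in_dim = 3 * (1 + 2 * n_freqs) + expr_dim
        self.mlp = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 4))            # (density, r, g, b)

    def forward(self, pts, expr):
        # pts: (N, 3) ray samples, expr: (expr_dim,) expression code for the current frame
        expr = expr.expand(pts.shape[0], -1)
        out = self.mlp(torch.cat([posenc(pts), expr], dim=-1))
        return torch.relu(out[:, :1]), torch.sigmoid(out[:, 1:])  # sigma, rgb

field = DynamicRadianceField()
sigma, rgb = field(torch.rand(4096, 3), torch.rand(76))
```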
Jing Zhang - UC-Net

2022-01-20 30:21

PAPER TITLE
"UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders"

AUTHORS
Jing Zhang, Deng-Ping Fan, Yuchao Dai, Saeed Anwar, Fatemeh Sadat Saleh, Tong Zhang, Nick Barnes

ABSTRACT
In this paper, we propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection by learning from the data labeling process. Existing RGB-D saliency detection methods treat the saliency detection task as a point estimation problem, and produce a single saliency map following a deterministic learning pipeline. Inspired by the saliency data labeling process, we propose a probabilistic RGB-D saliency detection network via conditional variational autoencoders to model human annotation uncertainty and generate multiple saliency maps for each input image by sampling in the latent space. With the proposed saliency consensus process, we are able to generate an accurate saliency map based on these multiple predictions. Quantitative and qualitative evaluations on six challenging benchmark datasets against 18 competing algorithms demonstrate the effectiveness of our approach in learning the distribution of saliency maps, leading to a new state-of-the-art in RGB-D saliency detection.

CODE
💻 https://github.com/JingZhang617/UCNet

RELATED PAPERS
📚 A probabilistic U-Net for segmentation of ambiguous images
📚 Learning structured output representation using deep conditional generative models

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

TIME STAMPS
00:00 |
00:02 | Intro
00:31 | The Authors
01:07 | Abstract / TLDR
02:41 | Motivation
07:18 | Related Work
09:20 | Approach
18:32 | Results
24:04 | Conclusions and future work
25:42 | What did reviewer 2 say?
29:49 | Outro

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧 Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦 Follow us on Twitter: https://twitter.com/talking_papers
🎥 YouTube Channel: https://bit.ly/3eQOgwP

#talkingpapers #CVPR2020 #RGBDSaliency #machinelearning #deeplearning #AI #neuralnetworks #research #computervision #artificialintelligence
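To make the "sample many maps, then agree on one" idea concrete, here is a bare-bones sketch of the inference side (a generic conditional-VAE decoder of my own design, not the released UC-Net, and with a simple majority vote standing in for the paper's saliency consensus process): several latent codes are drawn from the prior, one saliency map is decoded per code, and the samples are combined into a single prediction plus an uncertainty estimate.

```python
import torch
import torch.nn as nn

class TinySaliencyDecoder(nn.Module):
    def __init__(self, latent_dim=8, feat_ch=16):
        super().__init__()
        self.fuse = nn.Conv2d(feat_ch + latent_dim, 32, 3, padding=1)
        self.out = nn.Conv2d(32, 1, 1)

    def forward(self, feats, z):
        # feats: (B, feat_ch, H, W) fused RGB-D features, z: (B, latent_dim) latent sample
        z_map = z[:, :, None, None].expand(-1, -1, *feats.shape[2:])
        return torch.sigmoid(self.out(torch.relu(self.fuse(torch.cat([feats, z_map], 1)))))

decoder = TinySaliencyDecoder()
feats = torch.rand(1, 16, 64, 64)
samples = [decoder(feats, torch.randn(1, 8)) for _ in range(10)]     # 10 stochastic predictions
stack = torch.cat(samples, dim=0)                                     # (10, 1, H, W)
consensus = ((stack > 0.5).float().mean(dim=0) > 0.5).float()         # simple majority vote
uncertainty = stack.var(dim=0)                                         # disagreement across samples
```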
PAPER TITLE
"Deep Declarative Networks: a new hope"

AUTHORS
Stephen Gould, Richard Hartley, Dylan Campbell

ABSTRACT
We explore a new class of end-to-end learnable models wherein data processing nodes (or network layers) are defined in terms of desired behaviour rather than an explicit forward function. Specifically, the forward function is implicitly defined as the solution to a mathematical optimization problem. Consistent with nomenclature in the programming languages community, we name these models deep declarative networks. Importantly, we show that the class of deep declarative networks subsumes current deep learning models. Moreover, invoking the implicit function theorem, we show how gradients can be back-propagated through many declaratively defined data processing nodes thereby enabling end-to-end learning. We show how these declarative processing nodes can be implemented in the popular PyTorch deep learning software library allowing declarative and imperative nodes to co-exist within the same network. We also provide numerous insights and illustrative examples of declarative nodes and demonstrate their application for image and point cloud classification tasks.

TUTORIALS AND WORKSHOPS
ECCV 2020 Tutorial
CVPR 2020 Workshop

CODE
💻 Codebase
💻 Jupyter notebooks

PAPER
"Deep Declarative Networks: a new hope" (preprint)
"Deep Declarative Networks"

RELATED PAPERS
📚 "On differentiating parameterized argmin and argmax problems with application to bi-level optimization"
📚 "OptNet: Differentiable Optimization as a Layer in Neural Networks"

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

SUBSCRIBE AND FOLLOW
🎧 Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧 Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦 Follow us on Twitter: https://twitter.com/talking_papers
🎥 YouTube Channel: https://bit.ly/3eQOgwP

#talkingpapers #TPAMI2021 #deepdeclarativenetworks #machinelearning #deeplearning #AI #neuralnetworks #research #computervision #artificialintelligence

Recorded on March 31st 2021.
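The core trick is easy to demonstrate in a few lines. Below is a toy declarative node I wrote from the paper's recipe (not the authors' ddn library): the forward pass numerically solves an inner optimisation problem (a robust point fit), and the backward pass differentiates through its solution with the implicit function theorem instead of unrolling the solver.

```python
import torch

def objective(x, y):
    # inner problem: robust (pseudo-Huber) fit of a single 2D point y to the point set x
    return torch.sqrt(1.0 + ((y - x) ** 2).sum(dim=-1)).sum()

class RobustMeanNode(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        with torch.enable_grad():
            y = x.detach().mean(dim=0).clone().requires_grad_(True)   # warm start at the mean
            for _ in range(500):                                      # plain gradient descent
                g = torch.autograd.grad(objective(x.detach(), y), y)[0]
                y = (y - 0.05 * g).detach().requires_grad_(True)
        ctx.save_for_backward(x.detach(), y.detach())
        return y.detach()

    @staticmethod
    def backward(ctx, grad_out):
        x, y = ctx.saved_tensors
        with torch.enable_grad():
            x = x.clone().requires_grad_(True)
            y = y.clone().requires_grad_(True)
            gy = torch.autograd.grad(objective(x, y), y, create_graph=True)[0]        # df/dy at the solution
            H = torch.autograd.functional.hessian(lambda v: objective(x.detach(), v), y.detach())
            v = torch.linalg.solve(H, grad_out)                        # H^{-1} dL/dy
            grad_x = -torch.autograd.grad(gy, x, grad_outputs=v)[0]    # implicit function theorem
        return grad_x

x = torch.randn(20, 2, requires_grad=True)
y = RobustMeanNode.apply(x)   # forward: solve argmin_y f(x, y)
y.sum().backward()            # backward: implicit differentiation populates x.grad
```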
Paper title: "DORi: Discovering Object Relationships for Moment Localization of a Natural Language Query in a Video"

Authors: Cristian Rodriguez-Opazo, Edison Marrese-Taylor, Basura Fernando, Hongdong Li, Stephen Gould

Abstract: This paper studies the task of temporal moment localization in a long untrimmed video using a natural language query. Given a query sentence, the goal is to determine the start and end of the relevant segment within the video. Our key innovation is to learn a video feature embedding through a language-conditioned message-passing algorithm suitable for temporal moment localization, which captures the relationships between humans, objects and activities in the video. These relationships are obtained by a spatial sub-graph that contextualizes the scene representation using detected objects and human features. Moreover, a temporal sub-graph captures the activities within the video through time. Our method is evaluated on three standard benchmark datasets, and we also introduce YouCook II as a new benchmark for this task. Experiments show our method outperforms state-of-the-art methods on these datasets, confirming the effectiveness of our approach.

RESOURCES
Cristian's page: https://crodriguezo.github.io/
Code: https://github.com/crodriguezo/DORi

Related papers:
"Proposal free temporal moment localization": https://bit.ly/3EX1qCM
"Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs": https://bit.ly/3zt4aXA

Subscribe to the podcast: https://talking.papers.podcast.itzikbs.com
Subscribe to our mailing list: http://eepurl.com/hRznqb
Follow us on Twitter: https://twitter.com/talking_papers
YouTube Channel: https://bit.ly/3eQOgwP

CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com

Recorded on March 26th 2021.
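As a rough illustration of what language-conditioned message passing can look like, here is a tiny self-contained PyTorch sketch (my own simplification, not the DORi code): node features from the spatial sub-graph, e.g. detected objects and humans in one frame, exchange messages whose strength is gated by the query-sentence embedding.

```python
import torch
import torch.nn as nn

class LanguageConditionedMessagePassing(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.msg = nn.Linear(2 * dim, dim)     # message computed from a (sender, receiver) pair
        self.gate = nn.Linear(dim, dim)        # query-dependent gate on message channels

    def forward(self, nodes, query):
        # nodes: (N, dim) human/object features, query: (dim,) pooled sentence embedding
        n = nodes.size(0)
        senders = nodes.unsqueeze(0).expand(n, n, -1)         # (N, N, dim)
        receivers = nodes.unsqueeze(1).expand(n, n, -1)
        messages = self.msg(torch.cat([senders, receivers], dim=-1))
        gate = torch.sigmoid(self.gate(query))                 # which channels the query cares about
        attn = torch.softmax((messages * gate).sum(-1), dim=1) # (N, N) language-aware edge weights
        return nodes + attn @ nodes                             # updated node features

layer = LanguageConditionedMessagePassing()
nodes = torch.rand(7, 128)      # e.g. detected objects and humans in one frame
query = torch.rand(128)         # embedding of the natural-language query
updated = layer(nodes, query)
```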