Tagging.tech interview with Nicolas Loeillot

Update: 2016-03-22

Description

Tagging.tech presents an audio interview with Nicolas Loeillot about image recognition



 


Listen and subscribe to Tagging.tech on Apple Podcasts, AudioBoom, CastBox, Google Play, RadioPublic or TuneIn.




Keywording Now: Practical Advice on using Image Recognition and Keywording Services


Now available


keywordingnow.com


 


Transcript:


 


Henrik de Gyor:  This is Tagging.Tech. I’m Henrik de Gyor. Today, I’m speaking with Nicolas Loeillot. Nicolas, how are you?


Nicolas Loeillot:  Hi, Henrik. Very well, and you?


Henrik:  Great. Nicolas, who are you, and what do you do?


Nicolas:  I’m the founder of a company called LM3Labs. The company is entering its 14th year of existence. It was created in 2003, and we are based in Tokyo, in Singapore, and in Sophia Antipolis in the south of France.


We develop computer vision algorithms, software, and sometimes hardware. Instead of focusing on the traditional markets for this kind of technology, like military or security and those kinds of things, we decided to focus on more fun markets, like education, museums, entertainment, and marketing.


What we do is develop unique technologies based on computer vision systems. We were born out of the CNRS, which is the largest laboratory in France. We had our first patents for the triangulation of fingers in 3D space, so we could very accurately find fingers a few meters away from the camera and use them for interacting with large screens.


We thought it would be a good match with large projections or large screens, so we decided to go to Japan to meet video projector makers like Epson, Mitsubishi, and others. We presented the patent, just the paper, [laughs] explaining the opportunity for them, but nobody understood what the future of gesture interaction would be.


Everybody was saying, “OK, what is it for? There is no market for this kind of technology, and the customers are not asking for this.” That’s a very Japanese way to approach the market.


The very last week of our stay in Japan, we met with NTT DoCoMo, and they said, “Oh, yeah. That’s very interesting. It looks like Minority Report, and we could use this technology in our new showroom. If you can make a product from your beautiful patent, then we can be your first customer, and you can stay in Japan and everything.”


We went back to France and built the electronics to support the technology. Of course, some pilots were already written, so we went back to NTT DoCoMo, and we installed them in February 2004.


From there, NTT DoCoMo introduced us to many big companies, like NEC, DMP, and some others in Japan, and they all came with different types of requests: “OK, you track the fingers, but can you track the body motion? Can you track the gestures? Can you track the eyes, the face, the motions and everything?”


We made a strong evolution of the portfolio, with something like 12 products today, which are all computer vision-related and usually pretty unique in their domain, even if we have seen some big competitors like Microsoft [laughs] in our market.


In 2011, we saw the first deployment of 4G networks in Japan, and we said, “OK, what do we do with 4G? That’s very interesting: very large broadband, excellent response times and everything. What can we do?”


It was very interesting. We could do what we couldn’t do before, which is to put the algorithm on the cloud and use it on the smartphone, because smartphones were becoming very smart. It was just the beginning of smartphones at the time, with the iPhone 4S, which was the first one really capable of something.


We started to develop Xloudia, which is today one of our lead products. Xloudia is mass recognition of images, products, colors, faces and everything from the cloud, in 200 milliseconds. It goes super fast, and we search in very large databases. We can have millions of items in the database, and we can find the object or the specific item in 200 milliseconds.
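As a rough illustration of the kind of round trip Nicolas describes, here is a minimal Python sketch of a client sending one camera frame to a cloud recognition service and timing the response. The endpoint URL, field names, and response format are invented for this example; they are not the actual Xloudia API.

```python
# Hypothetical client for a cloud image-recognition lookup. The endpoint,
# request fields, and response shape are placeholders, not the Xloudia API.
import time
import requests

RECOGNITION_ENDPOINT = "https://api.example.com/v1/recognize"  # placeholder URL

def recognize(image_bytes: bytes) -> dict:
    """Send one camera frame to the cloud service and return the best match."""
    start = time.monotonic()
    response = requests.post(
        RECOGNITION_ENDPOINT,
        files={"image": ("frame.jpg", image_bytes, "image/jpeg")},
        timeout=1.0,  # the interview cites ~200 ms lookups; fail fast beyond that
    )
    response.raise_for_status()
    elapsed_ms = (time.monotonic() - start) * 1000
    result = response.json()
    print(f"match: {result.get('item_id')} in {elapsed_ms:.0f} ms")
    return result

if __name__ == "__main__":
    with open("frame.jpg", "rb") as f:
        recognize(f.read())
```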


Having applied the technology to augmented reality, which was done long before us, we said, “OK, image recognition can be applied to something which is maybe less fun than augmented reality, but much more useful, which is the recognition of everything.”


You just point your smartphone at any type of object, or people, or colors, or clothes, or anything, and we recognize it. This can be done with the algorithms, with the image recognition and the video recognition. That’s a key point, but not only with those kinds of algorithms.


We needed to develop deep learning recognition algorithms for finding proximities and similarities, and to offer users more capabilities than saying, “Yes, this is it,” or, “No, this is not it.” [laughs]
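For readers curious what “finding something similar or close” looks like in practice, here is a minimal sketch of embedding-based nearest-neighbor search, the general technique behind that kind of answer. The random vectors stand in for image embeddings produced by a deep learning model; this is not LM3Labs’ actual pipeline.

```python
# Minimal sketch of similarity search over image embeddings: rank catalog
# items by cosine similarity to a query embedding. Random vectors stand in
# for real embeddings from a trained model.
import numpy as np

def cosine_similarity(query: np.ndarray, catalog: np.ndarray) -> np.ndarray:
    """Cosine similarity between one query vector and every catalog vector."""
    query = query / np.linalg.norm(query)
    catalog = catalog / np.linalg.norm(catalog, axis=1, keepdims=True)
    return catalog @ query

def top_k_similar(query_vec, catalog_vecs, catalog_ids, k=3):
    """Return the k catalog items whose embeddings are closest to the query."""
    scores = cosine_similarity(query_vec, catalog_vecs)
    best = np.argsort(scores)[::-1][:k]
    return [(int(catalog_ids[i]), float(scores[i])) for i in best]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    catalog_vecs = rng.normal(size=(100_000, 128)).astype(np.float32)  # stand-in catalog
    catalog_ids = np.arange(len(catalog_vecs))
    # A query that is a slightly perturbed copy of item 42 should rank it first.
    query_vec = catalog_vecs[42] + 0.05 * rng.normal(size=128).astype(np.float32)
    print(top_k_similar(query_vec, catalog_vecs, catalog_ids))
```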


We focused on this angle: OK, computer vision is under control. We know our job, but we need to push the R&D into the distribution of the search over the network in order to go very fast. That’s the key point. The key point was going super fast, because for the user experience, it has to feel absolutely instantaneous.


The other question is, “If we don’t find exactly what the user searched for, how can we find something which is similar or close to what they are looking for?” There is an understanding of the search which goes far beyond the database that we have in the catalog, making links between the search and the environment of the users.


The other thing that we focused on was the user experience. For us, it was absolutely critical that people don’t press any button to find something. They just have to use their smartphone, point it at the object, or the page, or the clothes, or anything that they want to search, and the search is instantaneous, so there is no other action.


There is no picture to take. There is no capture. There is nothing to send manually. It’s just capturing in real time from the video flow of the smartphone, directly understanding what is passing in front of the smartphone. That was our focus.
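As an illustration of that button-free experience, here is a rough Python/OpenCV sketch that samples frames continuously from the live camera feed and hands them to a recognizer, with no explicit photo step for the user. The recognize_frame function is a placeholder for whatever cloud lookup an app would actually use.

```python
# Sketch of button-free recognition from the live camera feed: frames are
# sampled continuously and pushed to a recognizer. recognize_frame() is a
# placeholder for a real cloud lookup.
from typing import Optional
import cv2

def recognize_frame(jpeg_bytes: bytes) -> Optional[str]:
    """Placeholder: send the encoded frame to a recognition service."""
    return None  # stub: assume no match

def run_camera_loop(sample_every_n: int = 10) -> None:
    capture = cv2.VideoCapture(0)  # default camera
    frame_index = 0
    try:
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            frame_index += 1
            if frame_index % sample_every_n:
                continue  # only send a fraction of frames to limit bandwidth and latency
            ok, encoded = cv2.imencode(".jpg", frame)
            if ok:
                match = recognize_frame(encoded.tobytes())
                if match:
                    print("recognized:", match)
    finally:
        capture.release()

if __name__ == "__main__":
    run_camera_loop()
```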


<p class="graf graf--p graf-after--p" id="
Comments 
00:00
00:00
x

0.5x

0.8x

1.0x

1.25x

1.5x

2.0x

3.0x

Sleep Timer

Off

End of Episode

5 Minutes

10 Minutes

15 Minutes

30 Minutes

45 Minutes

60 Minutes

120 Minutes

Tagging.tech interview with Nicolas Loeillot

Tagging.tech interview with Nicolas Loeillot

Henrik de Gyor