Tagging.tech interview with Ramzi Rizk
Tagging.tech presents an audio interview with Ramzi Rizk
Henrik: This is Tagging.tech. I’m Henrik de Gyor. Today, I’m speaking with Ramzi Rizk. Ramzi, how are you?
Ramzi: Hey Henrik, how are you? I’m good thanks.
Henrik: Great. Ramzi, who are you and what do you do?
Ramzi: I’m one of the founders and I’m the CTO at a company called EyeEm.com. Based out of Berlin, we’re a photography company, been around for 5 and a half years now, where we’re a community and marketplace for authentic imagery. Basically, photos taken by average people who have a passion for photography, but aren’t necessarily professionals. Over the past few years, we’ve invested a lot and built quite a few technologies around understanding the content, context, and aesthetic qualities of images.
Henrik: Great. What are the biggest challenges and successes with image recognition?
Ramzi: I think over the past few years there’s been an amazing explosion in the number of tools that are available, particularly out of deep learning, to actually automate a big part of the photographer’s workflow, if you want. That includes, of course, recognizing what is in a photo, as well as what the quality of the photo is, and making photos just that much easier to find, to search and to share. I think the greatest successes have been naturally the fact that we’re at a point now where we can, with better than human accuracy, I would say, describe the content of a photo. A lot of the challenges would have to be around data. Deep learning is a very data-heavy field in that you need a lot of content that is properly labeled, properly tagged, in order to train these machines to recognize what’s in the images.
Over the past few years, things have gotten more and more accurate to the point where, in a lot of cases, machines are actually more accurate than humans at recognizing the various details in a photo. That being said, we as humans do have this innate ability to understand context and to draw out the more subtle, abstract notions of what an image is trying to convey, and that is definitely significantly more challenging to model in a machine.
Henrik: As of October 2016, how do you see image recognition changing?
Ramzi: I think we’re getting to a point where the pure art of recognizing what is in a photo has become a commodity, I would say. In the next 6 months to a year, you should be able to just license a variety of APIs. Google has an API out, so do we, so do a few other companies that are specialized in understanding the content of a photo. I think image recognition in a classical sense, how we understand it. When you think 10 years ago we were talking about how amazing it is that we can now recognize cats in videos. I think that challenge is one that is solved, and since it’s now a solved problem, we will be seeing, and we are seeing, a lot of applications built on top of this, doing things that were previously not possible.
That includes also having the ability to run these so-called models, these algorithms, on your device, on your phone, and not having to upload content to the cloud, even in real time. Which means we’re at a point now where, while you’re taking a photo, you can actually be getting real-time feedback on the quality of the image, on whether the photo that you’re taking is actually aesthetically appealing, and the minute you shoot it, your phone has already stored all of the content of that photo, making it searchable right away.
Henrik: Ramzi, what advice would you like to share with people looking into image recognition?
Ramzi: People looking into building image recognition solutions, I would recommend not to anymore, because as I said, the problem is solved. You don’t reinvent email, you build services on top of it, and I think today you’re at a point where you can build a lot of really exciting, interesting services on top of existing image recognition frameworks and existing APIs that offer this out of the box. For people looking at using it, I think this is the perfect time to actually start building these applications because the technology is mature enough, it’s more than affordable, and it’s at a point where anyone can really build software with the assumption that they understand what is in the photo.
Henrik: Where can we find out more information?
Ramzi: I would definitely have to pitch eyeem.com/tech, if you’re interested in looking at applied image recognition. We offer an API where you can actually keyword your entire content, your entire image library, for photography professionals or for amateurs. You can also have it caption images, or have images described in a full sentence. Even more interesting are machines that have now learned to understand your personal taste. They can actually surface content that you know you will like, or surface content that you know your customers will like or that your significant other would like, and then just simplify that entire process by really taking the monotonous, boring work out of photography, out of the photographer’s workflow.
As a photographer, you can just focus on the art of creation and on capturing that perfect moment. I think there’s a bunch of other services like Google Cloud Vision and so on, that you can also look at and learn more about what you can do with imagery today.
Henrik: Thanks Ramzi.
Ramzi: Thank you, Henrik. Pleasure speaking to you.
Henrik: For more of this, visit Tagging.tech.
For a book about this, visit keywordingnow.com