The eyes of machines: Marc Pollefeys on HoloLens 2 and Computer Vision.
Marc Pollefeys is a visionary. The professor of Computer Science at ETH Zurich carries out intensive research into how computer-based systems can learn to "see", a field more technically known as Computer Vision. He also sets great store by using science to solve real-life challenges. Marc Pollefeys spent two years co-developing HoloLens 2 at Microsoft's headquarters in Redmond, USA. We caught up with him at the opening of the new Mixed Reality & AI Lab he's heading up in Zurich.


Now available from Bechtle: HoloLens 2 with Dynamics 365 Remote Assist.


What is Computer Vision?

We mostly use our eyes to extract information and interact with the world. In Computer Vision, we develop algorithms so that computers can extract similar information from the world. In my group, we focus on extracting geometric information and combining it with semantic information so that robots and other devices can understand the world.
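A hypothetical sketch of what "extracting geometric information" means in practice: the pinhole camera model, one of the basic building blocks of geometric Computer Vision, relates 3D points in the world to 2D pixels in an image. (This is a textbook illustration, not code from the Pollefeys group.)

```python
import numpy as np

def project(K, R, t, X):
    """Project 3D world points X (Nx3) into pixel coordinates (Nx2)
    using intrinsics K, rotation R and translation t."""
    X_cam = X @ R.T + t          # world -> camera coordinates
    x = X_cam @ K.T              # apply camera intrinsics
    return x[:, :2] / x[:, 2:3]  # perspective divide

# Example values: focal length 500 px, principal point (320, 240),
# camera at the world origin looking down the z-axis.
K = np.array([[500., 0., 320.],
              [0., 500., 240.],
              [0., 0., 1.]])
R = np.eye(3)
t = np.zeros(3)
X = np.array([[0., 0., 2.]])     # a point 2 m straight ahead
print(project(K, R, t, X))       # -> [[320. 240.]]
```

Geometric Computer Vision largely runs this model in reverse: given the observed pixels, recover the 3D structure and the camera's pose.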


What has been the greatest progress in recent years?

In the last few years, there have been two major areas in Computer Vision. One is geometric Computer Vision, an area my group focuses on. It has made steady progress for many years, but recently things have really come together: there are now robust solutions, and things that really work are being used. You see the results in self-driving cars, in mixed reality, in robotics.


An area that we had long been struggling with is extracting semantic information from images. That has been revolutionised by AI and deep neural networks, in particular convolutional neural networks, which work especially well on image data. There's been a ton of progress in that space. Bringing these two together, we can now address many real-world applications.
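The operation that makes convolutional networks work so well on images can be shown in a few lines. This is a minimal hand-rolled sketch of a single convolution, not a real network: a small filter slides over the image, and each output value is the dot product of the filter with the patch under it.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution (cross-correlation) of a single-channel image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A vertical-edge filter responds where intensity changes left to right.
image = np.array([[0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]], dtype=float)
edge_filter = np.array([[-1., 1.]])
print(conv2d(image, edge_filter))  # each row -> [0. 1. 0.]
```

In a deep network, the filter weights are not designed by hand like this edge detector; they are learned from data, with many filters stacked in layers, which is what made semantic tasks like recognising objects in images tractable.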


As a result, the industry is really picking up on Computer Vision. Attendance at our conferences has increased by a factor of five. So there’s a huge amount of interest and investment in many areas of Computer Vision because it can now actually solve real-world problems.


What makes the HoloLens 2 better than its predecessor?

First, HoloLens 2 is way more comfortable. You can just put it on and go in and out of an immersive experience. But it can also understand a lot more of what's going on. In particular, the depth camera isn't used only to "click" on things; it actually tracks all your fingers, so you can manipulate digital information as intuitively as you would real-world objects.


There are also two small cameras that look at your eyes. These let you verify your identity with biometric authentication: the device knows it's you wearing it, so you can access your information. They also track your eyes, so we can feed the optimal imagery to each eye and optimise the microphone array, making the device ideal for speech. When I put it on, I have mics that pick up the environment, but I also have two mics very close to my mouth, so even if I whisper in a large factory, the device understands me.


One key benefit of the new device is that it's more immersive. The display, the part of the world that is overlaid with digital information, is much bigger now: more than twice as big as the first generation's. You get much more natural interaction with the world because more of it is augmented.


What must AR devices be able to do in the future to attract the mass market?

There are two things. I think HoloLens 2 is ready for large markets today, but in a set of specific scenarios where you put it on to do a task and put it down again when you're done. That's the current state of this device. Going forward, we will be able to shrink it further and make it more immersive. Then it will become natural to wear it all the time. Essentially, once we get to a glasses form factor, you'll wear it all the time; at that point, we'll actually forget it's there.


Today we can’t live without smartphones. Five, ten years from now, we’ll all wear glasses that put information before our eyes. We’ll always have access to the information we need to do whatever it is we do. It will all be very intuitive and natural, and everyone will have this type of device.


What we’re looking at is going from individual to "in-the-world" experiences. Essentially, we’re going towards shared mixed-reality experiences. To do that, we need many devices that are connected to the cloud. The cloud plays a critical role in combination with the devices. Smart edge devices and smart cloud together can provide these experiences.

Contact person.

Bechtle update editorial team









Published on Nov 5, 2019.