Supermarket websites hold a trove of information about product pricing. Market research companies have built systems to scrape this information from the Web, but still rely on human operators to process the data before it can feed their databases.
We developed an application that automatically processes the scraped product information. The automatic processing can be verified by a human operator to ensure quality. This application has divided the manual processing cost by a factor of 10 and cut the overall spending on scrape processing in half. It also allows for real-time processing and has improved quality, dividing the number of errors by 3.
This application handles the following tasks, all of which were previously done by humans:
- Product category detection: this task is very time-consuming for a human operator because there can be a very large number of categories to choose from, sometimes several thousand.
- Brand detection
- Product matching: when products don’t have an associated EAN/GTIN, it is hard to recognize that a product is identical to one previously seen, as millions of different products exist. This matching is crucial for marketing analysis. Our machine learning algorithms learned to distill the operators’ expertise.
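As a toy illustration of the matching problem, the sketch below pairs an incoming description with the closest previously seen product using plain string similarity (Python’s standard `difflib`). The real system relies on learned models; the catalog entries and the 0.75 threshold here are invented for the example.

```python
from difflib import SequenceMatcher

# Previously seen products (hypothetical catalog entries).
known_products = {
    "P001": "Coca-Cola Zero 1.5L bottle",
    "P002": "Evian natural mineral water 1L",
}

def best_match(description, threshold=0.75):
    """Return the id of the closest known product, or None if nothing is close enough."""
    best_id, best_score = None, 0.0
    for pid, name in known_products.items():
        score = SequenceMatcher(None, description.lower(), name.lower()).ratio()
        if score > best_score:
            best_id, best_score = pid, score
    return best_id if best_score >= threshold else None
```

A learned matcher replaces the similarity function, but the shape of the decision — best candidate above a tuned threshold, otherwise no match — stays the same.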
Every piece of information our system outputs is associated with a confidence score, which makes it possible to determine which information should be validated by a human operator and which can be considered correct without validation.
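The confidence-based routing described above can be sketched as follows; the field names and the 0.9 threshold are hypothetical, chosen only to illustrate the split between automatic acceptance and human validation.

```python
THRESHOLD = 0.9  # illustrative; in practice tuned per field

def route(extraction):
    """Split extracted fields into auto-accepted values and values to review."""
    accepted, review = {}, {}
    for field, (value, confidence) in extraction.items():
        if confidence >= THRESHOLD:
            accepted[field] = value
        else:
            review[field] = value  # sent to a human operator
    return accepted, review

accepted, review = route({
    "category": ("dairy/yogurt", 0.97),
    "brand": ("Danone", 0.95),
    "match_id": ("prod-123", 0.62),  # too uncertain: needs validation
})
```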
Leaflets are analyzed by human operators to give manufacturers, brands, and distributors data on the promotional activities of every retailer. This manual processing is expensive, too slow, and of variable quality.
We developed an application that automatically handles leaflets. The automatic processing can be verified by operators to ensure high quality. This application divides the cost of manual processing by 10 and cuts the total cost in half. It processes leaflets as fast as they arrive and improves data quality, dividing the number of errors by 3.
On a page, our application segments each region corresponding to a different promotion. If a promotion displays multiple products (a kitchen with multiple elements, for example), each product is detected. Promotional text is recognized, even when it employs fonts that standard OCR can’t handle. Once processing is done, each promotion has its information extracted and associated with one or more products in the marketing database. Each extracted piece of information carries a confidence score, which determines which information should be validated by a human operator and which can be considered correct without validation.
A meta-platform handles the distribution of products across a large number of marketplaces, allowing distributors to increase their sales.
Historically, this meta-platform has employed large teams of operators to rewrite product descriptions into the format of each marketplace. They are forced to manage the trade-off between description costs and expected profits from sales, seriously limiting their revenue.
We designed an application that automatically generates product features, in each marketplace’s format, from the distributor’s original description. This application allows a large number of products to be distributed on virtually every marketplace. The per-product processing cost decreases with volume, encouraging the meta-platform to process many products, thus increasing their revenue.
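A minimal sketch of the format-adaptation idea, with invented marketplace names and field schemas: one source description is projected onto each marketplace’s expected fields. The real system generates feature values, not just renamings.

```python
# Target formats (invented): each marketplace expects its own field names.
MARKETPLACE_SCHEMAS = {
    "marketA": ["title", "brand", "color"],
    "marketB": ["name", "manufacturer"],
}

# Maps each marketplace field back to the distributor's canonical field.
FIELD_ALIASES = {
    "title": "title", "name": "title",
    "brand": "brand", "manufacturer": "brand",
    "color": "color",
}

def to_marketplace(description, marketplace):
    """Select and rename fields to fit the target marketplace format."""
    return {
        field: description.get(FIELD_ALIASES[field])
        for field in MARKETPLACE_SCHEMAS[marketplace]
    }

listing = to_marketplace(
    {"title": "Blue cotton T-shirt", "brand": "Acme", "color": "blue"},
    "marketB",
)
```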
Every consumer has at some point struggled to find a product in a large store, only to face another challenge: finding a store clerk, then waiting for the clerk to be available. With staff reductions aggravating the problem, consumers are often more willing to buy online than in physical stores.
We designed a smartphone voice assistant that guides consumers searching for a product. The assistant makes the consumer’s life easier and encourages them to reconsider physical outlets. It also frees up time for retail personnel, allowing them to focus on their advisory role.
For now our assistant has been designed with big-box stores in mind, but it could be adapted to other settings where users struggle to orient themselves: airports, hospitals, train stations, etc.
Currently our system handles the dialogue between the assistant and the user; we are also developing a location system based on image recognition.
The construction industry is going through a digital revolution, in particular with Building Information Modeling (BIM), which aims to replace traditional drawings with a digital twin of the building. BIM will disrupt every construction process, with the potential to increase the sector’s productivity by 50% to 60%. These digital twins are essential to the construction process, but even more so to maintenance, which represents 80% of the sector’s total costs. There is thus a tremendous need to create digital twins of existing buildings. Currently, these twins are made by operators who reverse-engineer them from point clouds captured with lidar. This is a very labor-intensive task, which encourages outsourcing to countries with cheaper labor. Such solutions aren’t satisfactory: the twins need to be completed and verified on site. Multiple actors have abandoned outsourcing and put qualified operators who know the site in charge of this reverse-engineering work.
In association with a large design office, and leveraging the latest advances in 3D semantic segmentation, we are developing a prototype that automatically builds a draft version of the digital twin directly from the point cloud capture. This technique, coupled with others (2D plan reading, etc.), aims to drastically reduce the work that must be done by specialized operators and to deliver significant productivity gains.
This technique can also be used to certify a BIM model by comparing it to a capture of the building. While this use case has less economic potential, it is nonetheless crucial to the development of BIM. Currently, this certification is done by human operators assisted by augmented reality, and here too automation can be considered. Our technology automatically flags discrepancies between the BIM model and reality; these discrepancies can then be verified by a human operator.
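At its simplest, the discrepancy-flagging idea reduces to comparing declared elements against detected ones. The sketch below uses set differences over invented element identifiers; the real comparison operates on geometry from the point cloud, not just identifiers.

```python
def discrepancies(bim_elements, detected_elements):
    """Elements declared but not found on site, and elements found but undeclared."""
    missing = bim_elements - detected_elements
    undeclared = detected_elements - bim_elements
    return missing, undeclared

missing, undeclared = discrepancies(
    {"wall-01", "door-02", "window-03"},   # declared in the BIM model
    {"wall-01", "door-02", "pillar-09"},   # detected in the capture
)
```

Both sets of flags then go to a human operator for verification, mirroring the validation workflow described above.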
For a company in charge of road maintenance, we built a system that detects damaged road equipment to facilitate its upkeep.
This company has a fleet of vehicles that record the road; our system analyzes the video feeds, detects road equipment, determines its position, and estimates its level of wear.
This makes it possible to determine which equipment needs maintenance and to plan maintenance missions efficiently.
Thanks to this system, the operator can ensure optimal maintenance and avoid litigation due to degraded signage.
In the construction sector, contractual documents play an important role from both a technical and legal standpoint. It is thus important to manage them well so that every actor can access them efficiently.
We built an application that automatically classifies these documents, even when they are scanned images.
Before our system was in use, the department that manages these documents would often misclassify them, and the whole process took up a lot of their time.
Now the process has become more efficient, errors tend to be very rare, and the department can focus on higher value tasks.
Scientific experiments can be very costly and time-consuming; it is thus very important to choose the right ones to carry out. We know of two instances in the chemistry sector where the compounds themselves are expensive and the number of compounds that could be tested is so large that finding the most suitable ones by brute force is impractical.
From a database of experimental results on a limited number of compounds (around a hundred), we developed a predictive model that estimates the experimental results for thousands of compounds. This allows the experimenters to test only the most promising ones.
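The screening principle can be sketched as follows: score every candidate with the predictive model and keep only the top-ranked ones for real experiments. The weighted-sum “model” and the descriptor names below are stand-ins for the trained predictor.

```python
# Stand-in "model": a weighted sum over invented descriptors.
WEIGHTS = {"polarity": 0.7, "mass": -0.01}

def predicted_score(descriptors):
    """Predicted experimental outcome for one compound (higher is better)."""
    return sum(WEIGHTS[k] * v for k, v in descriptors.items())

candidates = {
    "cmpd-1": {"polarity": 0.9, "mass": 120.0},
    "cmpd-2": {"polarity": 0.2, "mass": 300.0},
    "cmpd-3": {"polarity": 0.8, "mass": 90.0},
}

def shortlist(candidates, k=2):
    """Keep only the k most promising compounds for real experiments."""
    ranked = sorted(candidates, key=lambda c: predicted_score(candidates[c]),
                    reverse=True)
    return ranked[:k]
```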
When experts selected which experiments to carry out, only 50% had a positive outcome; with our system, 90% of experiments are positive.
Our application allowed the experiment budget to be cut in half. More importantly, it enabled the rapid creation of a database of compounds that is of use to our client, which was their primary goal.
In the long term, we even hope to forgo real experiments and replace them with virtual ones.
CAMs (Conditional Access Modules) are electronic modules with a smartcard form factor; they can be inserted into most modern TVs to allow access to subscription channels.
Unlike set-top boxes, a CAM offers no video output signal that can be captured. Currently, to test these devices, an operator has to watch the TV while running test programs to make sure the image corresponds to the signal: video, black screen, menus, error pop-ups, etc.
We developed a program that analyzes the images from a recording of the screen, classifies them, and extracts their text, comparing the results to the expected behavior. One of the program’s strengths is its robustness to different lighting conditions, allowing its use in multiple test settings.
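One simple ingredient of lighting robustness is intensity normalization before classification and OCR. The sketch below applies min-max contrast stretching to a toy grayscale frame; the production pipeline is more elaborate than this.

```python
def stretch_contrast(frame):
    """Rescale a grayscale frame so its pixel values span the full 0-255 range."""
    lo = min(min(row) for row in frame)
    hi = max(max(row) for row in frame)
    if hi == lo:  # uniform frame: nothing to stretch
        return [[0 for _ in row] for row in frame]
    return [[round(255 * (p - lo) / (hi - lo)) for p in row] for row in frame]

# A dim frame (e.g. captured in a poorly lit test room).
dim_frame = [[60, 70], [80, 90]]
normalized = stretch_contrast(dim_frame)
```

After stretching, the same classifier sees comparable inputs whether the room was bright or dim.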
In the MORIA project we showed how virtual experiments are used to predict the outcome of real ones.
To build these virtual experiments, our algorithms need a dataset of past experiments and their outcomes. The more complex the experiments, the more data tends to be needed. One way to deal with insufficient data is to look for experimental results in the scientific literature; to this end we developed the ScientificReaderDigest project.
This project hinges on three tasks:
- Finding publications that contain relevant experimental results.
- Extracting the relevant parts from these publications.
- From these parts, extracting the experimental results to populate a database.
Accomplishing such tasks requires leveraging a complex semantic space that can only be learned from annotated examples. Building up these annotations is time-consuming and requires expert operators. We thus built an incremental system that can be bootstrapped from keywords relevant to the targeted semantic space.
These keywords allow us to generate queries to retrieve a first batch of documents. They also help locate relevant passages in each document, which operators can then annotate. These annotations help the system refine itself, up to the point where it can start extracting information on its own; the operator then only needs to validate the output when the system isn’t confident in its work. The system learns continuously, reducing the need for human intervention and increasing the share of automation.
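The bootstrap loop can be caricatured as follows: keyword overlap stands in for the learned relevance model, confident passages are kept automatically, and ambiguous ones go to an operator whose labels would refine the model. The keywords and thresholds are illustrative.

```python
# Bootstrap keywords for the targeted semantic space (illustrative).
keywords = {"yield", "catalyst", "temperature"}

def relevance(passage):
    """Fraction of the keywords present in the passage (stand-in model)."""
    return len(set(passage.lower().split()) & keywords) / len(keywords)

def triage(passages, low=0.3, high=0.7):
    """Keep confident passages automatically; send ambiguous ones to an operator."""
    auto, to_annotate = [], []
    for p in passages:
        score = relevance(p)
        if score >= high:
            auto.append(p)
        elif score >= low:
            to_annotate.append(p)  # operator labels feed back into the model
        # below `low`: discarded as irrelevant
    return auto, to_annotate

auto, to_annotate = triage([
    "The catalyst raised the yield at high temperature",
    "catalyst was added to the mixture",
    "unrelated acknowledgements section",
])
```

Each annotation round shrinks the middle band, which is exactly the “increasing share of automation” described above.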
A variety of files list large numbers of individuals. Searching for someone in such a file raises a number of challenges: homonyms, imprecise data, missing information, etc.
To overcome these difficulties, operators have developed an expertise that allows them to find the information they are looking for.
We have developed an application that automates this task.
For the searches where our application is very confident, around 85% of all searches, the automatic system is 3 times less likely to make an error than an operator.
To reduce costs, the most confident searches are automated and we assist the operator on the remaining cases. This cuts costs by 90% and reduces errors to a third of what they were.
Our client is mostly concerned with quality; to that end, we assist the operator on every search, combining human and artificial intelligence. Costs are cut in half and the error rate is divided by ten.
Our system is an overhaul of an existing one. It extracts information from free-form text: when the system is confident in an extraction, the information is inserted directly into the database; when the uncertainty is too high, a human operator validates it.
To ease the operator’s work, the document is automatically structured, making navigation and search easier. The source text of every extracted piece of information is presented to the operator, facilitating validation.
The previous system, which has been used for many years, is a rule-based system. It has proved very difficult to maintain and its performance is disappointing. Our system leverages machine learning and learns from the operators' work.
The new system already performs better than the previous one, and it will only get better as it captures more of the operators’ knowledge. To this end, the new system directly integrates an extensive annotation tool.
A startup created a mobile app that displays the waiting time until the next bus at traditional bus stops. You just aim your device at the bus stop and it is overlaid with a dynamic display. This app allows public transportation companies to improve their service without having to invest in new displays, particularly in less dense areas.
We developed an algorithm that detects bus stops in a video feed and runs on-device (on both iOS and Android). Once a bus stop is detected, its text is extracted with OCR to identify which stop it is; the corresponding waiting times are then displayed.
Following the implementation of the Product2MarketPlace system, our client experienced an increase in sales volume that overloaded their customer service. Automation allowed them to meet the demand without having to increase the size of their team.
We built an email classification system with the aim of automating replies. Thanks to this system, our client has been able to keep their need for outsourced operators to a minimum while improving response times. Only the most complex cases are still handled by their internal team, which didn’t need additional staff despite the increase in sales.
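A minimal sketch of the reply-automation idea: a classifier assigns an intent to each email, templated intents receive an automatic reply, and everything else escalates to the internal team. The keyword rules, intent names, and reply texts here are invented stand-ins for the trained classifier.

```python
# Intents with templated automatic replies (invented examples).
AUTO_REPLIES = {
    "order_status": "Your order is on its way; tracking details follow.",
    "return_request": "To return an item, please use the returns portal.",
}

def classify(email_text):
    """Toy keyword classifier standing in for the trained model."""
    text = email_text.lower()
    if "tracking" in text or "where is my order" in text:
        return "order_status"
    if "return" in text or "refund" in text:
        return "return_request"
    return "other"

def handle(email_text):
    """Reply automatically when possible, otherwise escalate to the team."""
    return AUTO_REPLIES.get(classify(email_text), "ESCALATE_TO_TEAM")
```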
© Aiway 2019