Natural Language Processing Systems which can be used to find recommendations.
Business overview
Location | Westcliff-on-Sea, United Kingdom |
---|---|
Website | www.senym.com |
Sectors | Content & Information Digital B2C |
Company number | 08488795 |
Incorporation date | 15 Apr 2013 |
Idea
Introduction
The central idea is to produce a recommendation engine based on Natural Language Processing (NLP) methods. The recommendation engine would operate behind a website or a user service and feed back to the site details of what users are saying. This could be done automatically by a computer system.
The system would learn what was said using empirical methods and control system methods. The engine would then be able to extract meaning, which could be put to commercial use or public use (for clients in the public sector).
The engine would be usable behind different types of websites, giving rise to a wide range of applications.
This opportunity has arisen now because of Big Data: a large volume of data makes a recommendation engine more effective, since training sets can be produced from it.
Intended impact
The impact of Senym will be to help commercialise the heavy volume of Internet communication traffic and to provide a cost-effective means of assisting security.
In essence, Senym is a means of interpreting language and, specifically, of finding the object of a sentence. Since 2011, the growth in data exchange has been so extensive that it has been given a new name, Big Data. This data now accounts for 90% of all the data held (offline and online), and it was created in just two years.
This huge repository of data takes the form of billions of emails, tweets, Facebook messages, web pages, forum messages, and blogs, as well as other forms. There are 144 billion emails alone every day and 2.7 billion Facebook likes.
This data is currently being commercialised, but there is plenty of scope for improvement. Senym would allow social media platforms to understand when recommendations are made, when certain facts are exchanged or requested (e.g. recipes), and when people are seeking recommendations.
Senym could also be applied as a spam filter or in security systems which read language (e.g. if someone is being monitored).
Other applications could include processing transcripts, understanding ebooks, and assisting in video search (once videos have been transcribed).
If an application uses language, then Senym could be used.
Substantial accomplishments to date
To date, a working demo has been built and implemented at Senym.com.
This demo was designed only to show that the technologies could in principle handle the data. I will outline what was done and why it matters. A C#/.NET solution was implemented to collect Twitter data for testing.
This data was passed to the Python Natural Language Toolkit (NLTK), which identified 600 sentence objects in a sample of 3000 tweets. The data was then loaded into a Neo4j graph and displayed visually using Arbor, which renders graphs in JavaScript. The system runs on Linux. Aside from C# and .NET, all the software is free for development.
The NLTK is known to be slow and would be replaced by a Java implementation in a full prototype. The NLTK stage currently writes its output to files; in a full prototype that would be replaced by a key-value store, for example one from the Hadoop ecosystem.
The demo showed that data could be extracted from social media, classified, identified, and then loaded into a graph. This is an essential step in the proof of concept.
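To make the shape of that pipeline concrete, here is a minimal sketch in Python using only the standard library. It is not the Senym demo code: the regular-expression heuristic stands in for the NLTK stage, an in-memory dictionary stands in for the Neo4j graph, and the cue words, function names, and sample messages are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical cue verbs that may signal a recommendation. This fixed list
# is an assumption for illustration; a real system would learn such cues.
CUE = re.compile(
    r"\b(?:recommend|try|love)\s+(?:(?:the|this|that|a|an)\s+)?(\w+)",
    re.IGNORECASE,
)


def extract_object(text):
    """Return the word following a cue verb (lowercased), or None."""
    match = CUE.search(text)
    return match.group(1).lower() if match else None


def build_graph(messages):
    """Map each extracted object to the set of users who mentioned it.

    The dictionary is a stand-in for a graph database: each key is an
    object node and each user in its set is an edge to a user node.
    """
    graph = defaultdict(set)
    for user, text in messages:
        obj = extract_object(text)
        if obj is not None:
            graph[obj].add(user)
    return dict(graph)


# Made-up sample messages in place of collected tweets.
sample = [
    ("alice", "I recommend the lasagne at Luigi's"),
    ("bob", "You should try lasagne sometime"),
    ("carol", "Nothing to report today"),
]
graph = build_graph(sample)
```

In this sketch, messages without a cue verb contribute nothing to the graph, which mirrors the demo's result that only a subset of tweets yielded a sentence object.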
Monetisation strategy
The fundamental business model is to sell the IP rights. The 10K requested is only to build a prototype and confirm proof of concept. Angel funding and VC funding would be needed to build a full application, at an estimated cost of 700K. However, the potential value of an NLP system would be many times 700K.
Supporting evidence for the potential monetisation of a Natural Language Processing system may be found in the announcement that 1.4 million dollars was invested in Idibon, which employs 6 people and "creates scalable languages technologies, supporting the natural language processing of enterprise organizations". The distinction between Senym and other NLP system providers is that Senym focuses on recommendations. Senym systems, once developed, would therefore lend themselves more to applications linking products and comments. If the IP rights could not be sold, we think the recommendation engine could be licensed instead, but this is seen as a second choice.
Subsequent investment would be raised in two rounds, with the bulk of the funding (500K) in the final, VC round.
It is estimated that, once proof of concept is established, 200K of Angel funding would be required to build a Big Data prototype which could handle a million records.
These sums are needed to hire the PhD-qualified staff required to work on this venture.
Use of proceeds
The 10K is required to establish proof of concept and to build a prototype which could work on 10,000 records and correctly identify the object in 50% of them.
A new PC would be needed and two developers would work on the venture for 3 months. At the end of this period, the system would consist of the following pipeline: data collection -> NLTK -> Key Value Store -> Graph.
Depending on the level of difficulty, there could be some progress in replacing the NLTK with Java, but that would really be part of the next phase of the venture, the Big Data prototype.
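One way the 50% target could be checked is sketched below. It assumes a hand-labelled test set in which each record has one expected object; the function name, the exact-match rule, and the sample data are hypothetical and are not taken from the Senym demo.

```python
def object_accuracy(predicted, expected):
    """Fraction of records whose predicted object equals the labelled one.

    A prediction of None (no object found) always counts as incorrect.
    """
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    correct = sum(
        1 for p, e in zip(predicted, expected) if p is not None and p == e
    )
    return correct / len(expected)


TARGET = 0.5  # the 50% threshold stated for the prototype

# Made-up predictions and labels for four records.
predicted = ["lasagne", None, "camera", "hotel"]
expected = ["lasagne", "pizza", "camera", "hostel"]

accuracy = object_accuracy(predicted, expected)
meets_target = accuracy >= TARGET
```

Exact string matching is the strictest possible rule; a real evaluation might credit partial or synonymous matches, which would change the measured rate.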
Market
Target market
The central idea is to produce a recommendation engine based on Natural Language Processing (NLP) methods. The recommendation engine would operate behind a website or a user service and feed back to the site details of what users are saying. This could be done automatically by a computer system.
The system would learn what was said using empirical methods and control system methods. The engine would then be able to extract meaning, which could be put to commercial use or public use (for clients in the public sector).
The engine would be usable behind different types of websites, giving rise to a wide range of applications.
This opportunity has arisen now because of Big Data: a large volume of data makes a recommendation engine more effective, since training sets can be produced from it.
Characteristics of target market
The market is very large, and the target companies routinely buy small venture-based companies for their IP rights and/or user database.
They have very high volumes of traffic and very large databases filled with content. We think they also convert that traffic and content into revenue poorly, because they fail to interpret the user traffic correctly. This creates a very big opportunity.
Evidence for their inefficiencies can be found in the very high price/earnings ratios and the very high failure rates in the sector.
In terms of size, all the statistics indicate that the Internet market is the largest market in the world, but it also underperforms for the reasons given.
Marketing strategy
The acquisition market is one which can be reached by using the appropriate contacts. It is a difficult one to penetrate and the acquiring company will look for significant IP rights and uniqueness.
The market can most effectively be reached via VC fund manager contacts, university contacts, and appropriate networking.
Competition strategy
In our view, the main reason why competitors have failed in NLP is that they have taken a computational linguistics approach, in which they try to analyse the text and produce a calculation or probability for its expected meaning.
The approach recommended in Senym is a control systems approach, which is not usually used in computational NLP. The difference is that a control systems approach tackles the problem from a decision-making perspective rather than from a classification and processing perspective. We think that many computer science graduates have training in processing data but little training in the more general approach used in subjects such as Physics, and little in the approximation and estimation approaches used in control engineering. Control problems tend to have so many variables that an analytical approach is not possible, and therefore approximations based on observations are made.
The problems in NLP have in general not been solved. Big Data, graph databases, and high processing speeds together present a real opportunity to partly solve some of the big problems in NLP.