Information filtering system

An information filtering system is a system that removes redundant or unwanted information from an information stream, using automated or computerized methods, before the stream is presented to a human user. Its main goal is to manage information overload and to increase the semantic signal-to-noise ratio. To do this, the user's profile is compared to some reference characteristics. These characteristics may originate from the information item or from the user's social environment.
Whereas in information transmission, signal processing filters are used against syntax-disrupting noise at the bit level, the methods employed in information filtering act at the semantic level.
The range of machine methods employed builds on the same principles as those for information extraction. A notable application can be found in the field of email spam filters. Thus, it is not only the information explosion that necessitates some form of filters, but also inadvertently or maliciously introduced pseudo-information.
On the presentation level, information filtering takes the form of user-preferences-based newsfeeds, etc.
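As a rough illustration of comparing a user's profile to characteristics of the information item, the following minimal sketch scores each item in a stream against a profile using cosine similarity over word counts and drops the items that fall below a threshold. The function names, the threshold value and the sample stream are assumptions made for this example only; real systems generally use richer text representations.

```python
import math
from collections import Counter

def term_vector(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def filter_stream(profile_text, items, threshold=0.2):
    """Keep only items whose similarity to the user profile meets the threshold.

    The threshold of 0.2 is an arbitrary value chosen for illustration.
    """
    profile = term_vector(profile_text)
    return [item for item in items
            if cosine(profile, term_vector(item)) >= threshold]

# Hypothetical stream and profile.
stream = [
    "new telescope captures distant galaxy",
    "celebrity gossip of the week",
    "galaxy survey maps dark matter",
]
kept = filter_stream("astronomy telescope galaxy", stream)
print(kept)
```

With this sample data the second item shares no terms with the profile, so it is the one removed from the stream.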
Recommender systems and content discovery platforms are active information filtering systems that attempt to present to the user information items the user is interested in. These systems add information items to the information flowing towards the user, as opposed to removing information items from the information flow towards the user. Recommender systems typically use collaborative filtering approaches or a combination of the collaborative filtering and content-based filtering approaches, although content-based recommender systems do exist.
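The collaborative filtering approach mentioned above can be sketched, under simplifying assumptions, as a user-based recommender: items the user has not rated are scored by the ratings of other users, weighted by how similar those users are to the target user. The rating data, user names and function names below are hypothetical, and this is one simple variant rather than the method of any particular system.

```python
import math

def similarity(a, b):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(ratings, user, top_n=2):
    """Rank items the user has not rated by similarity-weighted average rating."""
    scores, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item not in ratings[user]:
                scores[item] = scores.get(item, 0.0) + sim * rating
                weights[item] = weights.get(item, 0.0) + sim
    predicted = {item: scores[item] / weights[item] for item in scores}
    return sorted(predicted, key=predicted.get, reverse=True)[:top_n]

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ana":   {"a": 5, "b": 4},
    "ben":   {"a": 5, "b": 5, "c": 5, "d": 1},
    "carla": {"a": 1, "c": 1, "d": 5},
}
print(recommend(ratings, "ana"))
```

Here "ben" rates the shared items much as "ana" does, so his high rating for item "c" outweighs the low rating from the dissimilar user "carla".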

History

Before the advent of the Internet, there were already several methods of filtering information; for instance, governments may control and restrict the flow of information in a given country by means of formal or informal censorship.
Newspaper editors and journalists also act as information filters when they select the most valuable information for their clients: readers of books, magazines and newspapers, radio listeners and TV viewers. This filtering operation is also present in schools and universities, where information is selected according to academic criteria for the customers of this service, the students. With the advent of the Internet, anyone can publish anything they wish at low cost. This considerably increases the amount of less useful information, and consequently the quality information becomes scattered among it. In response to this problem, new filtering methods began to be devised to obtain the information required on each specific topic easily and efficiently.

Operation

A filtering system of this kind consists of several tools that help people find the most valuable information, so that the limited time they can dedicate to reading, listening or viewing is directed to the most interesting and valuable documents. These filters are also used to organize and structure information in a correct and understandable way, and to group incoming mail messages. They are essential to the results returned by Internet search engines. Filtering functions improve every day, making the retrieval of Web documents and messages more efficient.

Criterion

One of the criteria used in this step is whether a piece of knowledge is harmful or not, that is, whether understanding is better with or without it. In this case, the task of information filtering is to reduce or eliminate the harmful information.

Learning System

A learning system for content generally consists of three basic stages:

First, a solver that provides solutions to a defined set of tasks.
Second, an evaluation stage that applies assessment criteria to measure how well the solver of the previous stage performs on the problems.
Third, an acquisition module whose output is the knowledge obtained, which is then used by the solver of the first stage.

Future

Currently, the problem is not finding the best way to filter information, but finding how these systems can learn the information needs of users independently, so that they automate not only the filtering process but also the construction and adaptation of the filter. Fields such as statistics, machine learning, pattern recognition and data mining are the basis for developing information filters that emerge and adapt from experience. For the learning process to be carried out, part of the information has to be pre-filtered: there must be positive and negative examples, called training data, which can be generated by experts or through feedback from ordinary users.
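To make the idea of learning a filter from positive and negative examples concrete, here is a minimal sketch of a naive Bayes text classifier with add-one smoothing, one of several possible statistical techniques. The labels "keep" and "discard", the function names and the tiny training set are assumptions introduced for illustration; the sketch also assumes that both labels appear at least once in the training data.

```python
import math
from collections import Counter

LABELS = ("keep", "discard")

def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts

def classify(model, text):
    """Return the more probable label under naive Bayes with add-one smoothing.

    Assumes every label occurs at least once in the training data.
    """
    word_counts, label_counts = model
    vocabulary = set(word_counts["keep"]) | set(word_counts["discard"])
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in LABELS:
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: positive ("keep") and negative ("discard") examples.
training_data = [
    ("meeting agenda for monday", "keep"),
    ("project report attached", "keep"),
    ("win free money now", "discard"),
    ("free prize claim now", "discard"),
]
model = train(training_data)
print(classify(model, "free money prize"))
print(classify(model, "monday project meeting"))
```

Feedback from users could extend `training_data` over time, and retraining on the enlarged set is what lets the filter adapt.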

Error

As data is entered, the system incorporates new rules. If we assume that these rules generalize from the training data, we then have to evaluate the system and measure its ability to correctly predict the categories of new information. This step is simplified by setting aside part of the training data as a separate set, called "test data", which is used to measure the error rate. As a general rule, it is important to distinguish between types of errors. For example, in the case of a content aggregator for children, letting through unsuitable information that shows violence or pornography is a more serious mistake than discarding some appropriate information.
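The evaluation step described above can be sketched as follows: a hold-out split sets aside test data, and an error report counts the two kinds of mistakes separately so they can be weighted differently. The function names, the 25% test fraction and the cost values (a missed unsuitable item counted as ten times worse than a wrongly discarded one) are assumptions chosen purely for illustration.

```python
import random

def hold_out(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and split them into (training, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def error_report(predictions, truths, positive="unsuitable"):
    """Return the overall error rate and the count of each error type."""
    if len(predictions) != len(truths) or not truths:
        raise ValueError("predictions and truths must be non-empty and equal in length")
    pairs = list(zip(predictions, truths))
    # Unsuitable items that were let through.
    missed = sum(1 for p, t in pairs if t == positive and p != positive)
    # Suitable items that were discarded by mistake.
    discarded = sum(1 for p, t in pairs if t != positive and p == positive)
    return {
        "error_rate": (missed + discarded) / len(truths),
        "missed_unsuitable": missed,
        "wrongly_discarded": discarded,
    }

def weighted_cost(report, cost_missed=10.0, cost_discarded=1.0):
    """Combine the two error counts using hypothetical, asymmetric costs."""
    return (cost_missed * report["missed_unsuitable"]
            + cost_discarded * report["wrongly_discarded"])

# Hypothetical test labels and filter predictions.
truths      = ["unsuitable", "suitable", "suitable", "unsuitable", "suitable"]
predictions = ["unsuitable", "unsuitable", "suitable", "suitable", "suitable"]
report = error_report(predictions, truths)
print(report, weighted_cost(report))
```

Two filters with the same error rate can have very different weighted costs, which is why the error types are reported separately.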
To lower error rates and give these systems learning capabilities similar to those of humans, systems must be developed that simulate human cognitive abilities, such as natural language understanding, capturing common-sense meaning, and other forms of advanced processing that get at the semantics of information.

Fields of use

Nowadays, there are numerous techniques for developing information filters, some of which reach error rates lower than 10% in various experiments. These techniques include decision trees, support vector machines, neural networks, Bayesian networks, linear discriminants and logistic regression.
At present, these techniques are used in different applications, not only on the web but in areas as varied as voice recognition, classification in telescopic astronomy and evaluation of financial risk.
