Given the Boom of Deep Learning Approaches
INFOTEC
Public Research Center of the Federal Government that contributes to the Digital Transformation of Mexico through research, innovation, academic training, and the development of ICT products and services. Its scope covers the public and private sectors, enabling paths that lead toward a modern and digitally inclusive Mexico.
GitHub: https://github.com/INGEOTEC
WebPage: https://ingeotec.github.io/
Artificial Intelligence (AI)
Theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages.
Machine Learning
Machine learning (ML) is a subfield of artificial intelligence that focuses on the development and implementation of algorithms capable of learning from data without being explicitly programmed.
Natural Language Processing (NLP)
NLP is a branch of artificial intelligence (AI) that uses machine learning and other technologies to enable computers to understand, process, and manipulate human language.
How we look at it
How I see it
Problem
Classifier
Definition
The aim is the classification of documents into a fixed number of predefined categories.
Polarity
El día de mañana no podré ir con ustedes a la librería (Tomorrow I will not be able to go with you to the bookstore)
Negative
https://ingeotec.github.io/Delitos
| | texto | etiqueta |
|---|---|---|
| 0 | Dile que soy quien te llama y aún respondes🎶 | N |
| 1 | policia: Detenido en #Zaragoza el organizador ... | P |
| 2 | Qué mala hostia me ha entrado en un momento po... | N |
| 3 | Gran trabajo compañero. Cuatro peligrosos y vi... | P |
| 4 | El CNI no ha detectado ningún ciberataque del ... | N |
| 5 | #SucesosProvincia Hombre es ultimado a tiros e... | P |
| 6 | Un autobús y un turismo chocan tras saltarse e... | N |
| 7 | La PNC reporta la captura de pandillero que ma... | P |
Bag of Words
Associate each token \(t\) with a vector
\[ \mathbf{v_t} \in \mathbb R^d \]
Bag of words
\[ \mathbf x = \frac{\sum_t \mathbf{v_t}}{\lVert \sum_t \mathbf{v_t} \rVert} \]
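The formula above can be sketched in a few lines of plain Python: sum the vectors of the tokens in a document and divide by the Euclidean norm of the sum. The function name `bag_of_words` and the two-dimensional token vectors are hypothetical, chosen only for illustration.

```python
import math

def bag_of_words(tokens, vectors):
    """Return the unit-length sum of the vectors associated with `tokens`."""
    d = len(next(iter(vectors.values())))  # dimension of the token vectors
    total = [0.0] * d
    for t in tokens:
        for i, value in enumerate(vectors[t]):
            total[i] += value
    norm = math.sqrt(sum(c * c for c in total))
    if norm == 0:
        return total  # avoid division by zero for an all-zero sum
    return [c / norm for c in total]

# Hypothetical token vectors (assumption, not taken from any real model).
vectors = {"good": [1.0, 0.0], "day": [0.0, 1.0], "bad": [-1.0, 0.0]}
x = bag_of_words(["good", "day"], vectors)
print(x)
```

The resulting vector always has unit length, so documents of different sizes become comparable through their dot product.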
Orthogonal
\[ \forall_{i \neq j} \mathbf{v_i} \cdot \mathbf{v_j} = 0 \]
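A minimal sketch of the orthogonal case, assuming one-hot vectors (one dimension per vocabulary entry) as the token vectors. It illustrates one effect of orthogonality: the similarity between two documents depends only on the tokens they share, so "good" is as unrelated to "great" as it is to "bad". The vocabulary and helper names are hypothetical.

```python
import math

def one_hot(vocabulary):
    """Map each token to a one-hot vector; distinct tokens are orthogonal."""
    d = len(vocabulary)
    return {t: [1.0 if i == j else 0.0 for j in range(d)]
            for i, t in enumerate(vocabulary)}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def represent(tokens, vectors):
    """Normalized sum of the token vectors of a non-empty document."""
    d = len(next(iter(vectors.values())))
    total = [sum(vectors[t][i] for t in tokens) for i in range(d)]
    norm = math.sqrt(dot(total, total))
    return [c / norm for c in total]

# Hypothetical vocabulary, for illustration only.
vocabulary = ["good", "great", "bad", "day"]
vectors = one_hot(vocabulary)

# Every pair of distinct token vectors has a zero dot product.
orthogonal = all(dot(vectors[a], vectors[b]) == 0
                 for a in vocabulary for b in vocabulary if a != b)

x1 = represent(["good", "day"], vectors)
x2 = represent(["great", "day"], vectors)
x3 = represent(["bad"], vectors)
sim_12 = dot(x1, x2)  # only the shared token "day" contributes
sim_13 = dot(x1, x3)  # no shared tokens
print(orthogonal, sim_12, sim_13)
```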
Consequences
Document / TFIDF
Select a token
Supervised learning
Classification
CBOW
Dense representation
Classification
Linear Classifier
Dense representation
Classification
Outside NLP
Attention Is All You Need
BERT
Equation
\[ \textsf{att}(Q, K, V) = \textsf{softmax}\left(\frac{QK^\intercal}{\sqrt{d_k}}\right) V \]
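The equation can be traced step by step with a small pure-Python sketch: compute \(QK^\intercal\), scale by \(\sqrt{d_k}\), apply a row-wise softmax, and multiply by \(V\). Matrices are lists of rows; the helper names and the example values are assumptions made for illustration, and this single-head version omits masking and learned projections.

```python
import math

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(K[0])  # dimension of each key vector
    scores = matmul(Q, transpose(K))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, V)

# Hypothetical queries, keys, and values (two tokens, d_k = 2).
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out = attention(Q, K, V)
print(out)
```

Each output row is a weighted average of the rows of \(V\), with larger weights for keys that align with the query.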
Parts
Analysis
Definitions