Gramáticas
30 min | Última modificación: Diciembre 10, 2020
Text Analytics with Python
Ejemplo de una gramática ambigüa
S -> NP VP
PP -> P NP
NP -> Det N
| Det N PP
| 'I'
VP -> V NP
| VP PP
Det -> 'an'
| 'my'
N -> 'elephant'
| 'pajamas'
V -> 'shot'
P -> 'in'
Aplicación a la frase:
I shot and elephant in my pijamas
Aplicación (parte 1)
Frase Regla Stack
------------------------------------------------------------------------------------------------------
>I< shot an elephant in my pijamas S
>I< shot an elephant in my pijamas S -> NP VP (S NP VP)
>I< shot an elephant in my pijamas NP -> Det N (S NP VP)
Falla Det -> 'an' | 'my' pero >I<
>I< shot an elephant in my pijamas NP -> Det N PP (S NP VP)
Falla Det -> 'an' | 'my' pero >I<
>I< shot an elephant in my pijamas NP -> 'I' (S (NP I) VP)
>shot< an elephant in my pijamas VP -> V NP (S (NP I) (VP V NP))
Ambiguo VP tiene dos posibles reglas
>shot< an elephant in my pijamas V -> 'shot' (S (NP I) (VP (V shot) NP))
>an< elephant in my pijamas NP -> Det N (S (NP I) (VP (V shot) (NP Det N)))
>an< elephant in my pijamas Det -> 'an' (S (NP I) (VP (V shot) (NP (Det an) N)))
>elephant< in my pijamas N -> 'Elephant' (S (NP I) (VP (V shot) (NP (Det an) (N Elephant)))
Falla. No se reconocio toda la frase
Aplicación (parte 2)
Frase Regla Stack
------------------------------------------------------------------------------------------------------
(de la tabla anterior)
>I< shot an elephant in my pijamas NP -> 'I' (S (NP I) VP)
>shot< an elephant in my pijamas VP -> VP PP (S (NP I) (VP VP PP))
Ambiguo VP tiene dos posibles reglas
>shot< an elephant in my pijamas VP -> V NP (S (NP I)
Ambiguo VP tiene dos posibles reglas (VP
(VP V NP)
PP))
>shot< an elephant in my pijamas V -> shot (S (NP I)
(VP
(VP (V shot) NP)
PP))
>an< elephant in my pijamas NP -> Det N (S (NP I)
Dos posibles reglas (VP
(VP (V shot) (NP Det N))
PP))
>an< elephant in my pijamas Det -> an (S (NP I)
(VP
(VP (V shot) (NP (Det an) N))
PP))
>elephant< in my pijamas N -> 'elephant' (S (NP I)
(VP
(VP (V shot) (NP (Det an) (N elephant))
PP))
>in< my pijamas PP -> P NP (S (NP I)
(VP
(VP (V shot) (NP (Det an) (N elephant))
(PP P NP)))
>in< my pijamas P -> in (S (NP I)
(VP
(VP (V shot) (NP (Det an) (N elephant))
(PP (P in) NP)))
>my< pijamas NP -> Det N (S (NP I)
(VP
(VP (V shot) (NP (Det an) (N elephant))
(PP (P in) (NP Det N))))
>my< pijamas Det -> my (S (NP I)
(VP
(VP (V shot) (NP (Det an) (N elephant))
(PP (P in) (NP (Det my) N))))
>pijamas< N -> pijamas (S (NP I)
(VP
(VP (V shot) (NP (Det an) (N elephant))
(PP (P in) (NP (Det my) (N pijamas)))))
>< (S (NP I)
(VP
(VP (V shot) (NP (Det an) (N elephant))
(PP (P in) (NP (Det my) (N pijamas)))))
Ejercicio Complete la tabla siguiente:
Frase Regla Stack
------------------------------------------------------------------------------------------------------
(de la tabla inicial)
>I< shot an elephant in my pijamas NP -> 'I' (S (NP I) VP)
>shot< an elephant in my pijamas VP -> V NP (S (NP I)
(VP
V
NP))
(Continue)
[1]:
import nltk
[2]:
##
## Implementación en NLTK
##
##
groucho_grammar = nltk.CFG.fromstring(
"""
S -> NP VP
PP -> P NP
NP -> Det N | Det N PP | 'I'
VP -> V NP | VP PP
Det -> 'an' | 'my'
N -> 'elephant' | 'pajamas'
V -> 'shot'
P -> 'in'
"""
)
##
## Dos posibles descomposiciones de la misma frase
## usando la gramatica anterior
##
sent = ["I", "shot", "an", "elephant", "in", "my", "pajamas"]
parser = nltk.ChartParser(groucho_grammar)
for tree in parser.parse(sent):
print(tree)
(S
(NP I)
(VP
(VP (V shot) (NP (Det an) (N elephant)))
(PP (P in) (NP (Det my) (N pajamas)))))
(S
(NP I)
(VP
(V shot)
(NP (Det an) (N elephant) (PP (P in) (NP (Det my) (N pajamas))))))
Gramática libre del contexto
[3]:
##
## Gramática libre del contexto
## Ejemplo de una gramática estructuralmente ambigua
##
grammar1 = nltk.CFG.fromstring(
"""
S -> NP VP
VP -> V NP | V NP PP
PP -> P NP
V -> "saw" | "ate" | "walked"
NP -> "John" | "Mary" | "Bob" | Det N | Det N PP
Det -> "a" | "an" | "the" | "my"
N -> "man" | "dog" | "cat" | "telescope" | "park"
P -> "in" | "on" | "by" | "with"
"""
)
sent = "Mary saw Bob".split()
rd_parser = nltk.RecursiveDescentParser(grammar1)
for tree in rd_parser.parse(sent):
print(tree)
##
## S Sentence
## NP Noun phrase
## VP Verb phrase
## PP Prepositional phrase
## Det Determiner
## N Noun
## V Verb
## P Preposition
##
(S (NP Mary) (VP (V saw) (NP Bob)))
Ejercicio Construya el arbol que deriva la expresión anterior.
Escritura de y almacenamiento de una gramática
[4]:
%%writefile mygrammar.cfg
S -> NP VP
VP -> V NP | V NP PP
PP -> P NP
V -> "saw" | "ate" | "walked"
NP -> "John" | "Mary" | "Bob" | Det N | Det N PP
Det -> "a" | "an" | "the" | "my"
N -> "man" | "dog" | "cat" | "telescope" | "park"
P -> "in" | "on" | "by" | "with"
Overwriting mygrammar.cfg
[5]:
##
## Lectura del archivo y parsing de la frase
##
import nltk
grammar1 = nltk.data.load('file:mygrammar.cfg')
sent = "Mary saw Bob".split()
rd_parser = nltk.RecursiveDescentParser(grammar1, trace=1)
for tree in rd_parser.parse(sent):
print(tree)
Parsing 'Mary saw Bob'
Found a parse:
(S (NP Mary) (VP (V saw) (NP Bob)))
(S (NP Mary) (VP (V saw) (NP Bob)))
[6]:
##
## Gramática recursiva
## Se dice que una gramática es recursiva si el nombre
## de la regla también aparece al lado derecho
## - Recursión directa: Nom -> Adj Nom
## - Recursión indirecta: S -> NP VP & VP -> V S
##
grammar2 = nltk.CFG.fromstring(
"""
S -> NP VP
NP -> Det Nom | PropN
Nom -> Adj Nom | N
VP -> V Adj | V NP | V S | V NP PP
PP -> P NP
PropN -> 'Buster' | 'Chatterer' | 'Joe'
Det -> 'the' | 'a'
N -> 'bear' | 'squirrel' | 'tree' | 'fish' | 'log'
Adj -> 'angry' | 'frightened' | 'little' | 'tall'
V -> 'chased' | 'saw' | 'said' | 'thought' | 'was' | 'put'
P -> 'on'
"""
)
[7]:
##
## Ejemplo de recursión por frases nominales anidadas
##
sent = "the angry bear chased the frightened little squirrel".split()
rd_parser = nltk.RecursiveDescentParser(grammar2)
for tree in rd_parser.parse(sent):
print(tree)
(S
(NP (Det the) (Nom (Adj angry) (Nom (N bear))))
(VP
(V chased)
(NP
(Det the)
(Nom (Adj frightened) (Nom (Adj little) (Nom (N squirrel)))))))
[8]:
##
## Ejemplo de recursión por sentencias anidadas
##
sent = "Chatterer said Buster thought the tree was tall".split()
rd_parser = nltk.RecursiveDescentParser(grammar2)
for tree in rd_parser.parse(sent):
print(tree)
(S
(NP (PropN Chatterer))
(VP
(V said)
(S
(NP (PropN Buster))
(VP
(V thought)
(S (NP (Det the) (Nom (N tree))) (VP (V was) (Adj tall)))))))
Parser recursivo descendente
Este parser NO puede manejar recusión por la izquierda (X -> X Y) ya que genera ciclos infinitos.
grammar1 = nltk.CFG.fromstring(
"""
S -> NP VP
VP -> V NP | V NP PP
PP -> P NP
V -> "saw" | "ate" | "walked"
NP -> "John" | "Mary" | "Bob" | Det N | Det N PP
Det -> "a" | "an" | "the" | "my"
N -> "man" | "dog" | "cat" | "telescope" | "park"
P -> "in" | "on" | "by" | "with"
"""
)
Parsing de la frase “Mary saw a dog”.
Frase Regla Stack
------------------------------------------------------------------------------------------------------
>Mary< saw a dog S
>Mary< saw a dog S -> NP VP (S NP VP)
>Mary< saw a dog NP -> "John" (S NP VP) Falla, backtracking
>Mary< saw a dog NP -> "Mary" (S (NP Mary) VP) Match, consume el símbolo
>saw< a dog VP -> V NP (S
(NP Mary)
(VP V NP))
>saw< a dog V -> "saw" (S Match, consume el símbolo
(NP Mary)
(VP (V saw) NP))
>a< dog NP -> "John" (S Falla, backtracking
(NP Mary)
(VP (V saw) NP))
>a< dog NP -> "Mary" (S Falla, backtracking
(NP Mary)
(VP (V saw) NP))
>a< dog NP -> "Bob" (S Falla, backtracking
(NP Mary)
(VP (V saw) NP))
>a< dog NP -> Det N (S
(NP Mary)
(VP
(V saw)
(NP Det N)))
>a< dog Det -> "a" (S Match, consume el símbolo
(NP Mary)
(VP
(V saw)
(NP (Det a) N)))
>dog< N -> "man" (S Falla, backtracking
(NP Mary)
(VP
(V saw)
(NP (Det a) N)))
>dog< N -> "dog" (S Match, consume el símbolo
(NP Mary)
(VP
(V saw)
(NP (Det a) (N dog))))
>< N -> "dog" (S
(NP Mary)
(VP
(V saw)
(NP (Det a) (N dog))))
La sentencia fue reconocida correctamente: No hay símbolos por consumir, y no quedan reglas por reemplazar en el stack.
[9]:
##
## Llame la función usando diferentes valores para el
## parámetro trace
##
rd_parser = nltk.RecursiveDescentParser(grammar1, trace=1)
sent = 'Mary saw a dog'.split()
for tree in rd_parser.parse(sent):
print(tree)
Parsing 'Mary saw a dog'
Found a parse:
(S (NP Mary) (VP (V saw) (NP (Det a) (N dog))))
(S (NP Mary) (VP (V saw) (NP (Det a) (N dog))))
Shift-reduce Parsing
Operación de un parser shift-reduce.
S -> NP VP
VP -> V NP
| V NP PP
PP -> P NP
V -> "saw"
| "ate"
| "walked"
NP -> "John"
| "Mary"
| "Bob"
| Det N
| Det N PP
Det -> "a"
| "an"
| "the"
| "my"
N -> "man"
| "dog"
| "cat"
| "telescope"
| "park"
P -> "in"
| "on"
| "by"
| "with"
Op. Regla Stack Remaining Text
-----------------------------------------------------------------------------------
the dog saw a man in the park<
shift the dog saw a man in the park<
reduce Det -> the (Det the) dog saw a man in the park<
shift (Det the) saw a man in the park<
dog
reduce N -> dog (Det the) saw a man in the park<
(N dog)
reduce NP -> Det N (NP saw a man in the park<
(Det the)
(N dog))
shift (NP a man in the park<
(Det the)
(N dog))
saw
reduce V -> saw (NP a man in the park<
(Det the)
(N dog))
(V saw)
shift (NP man in the park<
(Det the)
(N dog))
(V saw)
a
reduce Det -> a (NP man in the park<
(Det the)
(N dog))
(V saw)
(Det a)
shift (NP in the park<
(Det the)
(N dog))
(V saw)
(Det a)
man
reduce N -> man (NP in the park<
(Det the)
(N dog))
(V saw)
(Det a)
(N man)
reduce NP -> Det N (NP in the park<
(Det the)
(N dog))
(V saw)
(NP
(Det a)
(N man))
shift (NP the park<
(Det the)
(N dog))
(V saw)
(NP
(Det a)
(N man))
in
reduce P -> in (NP the park<
(Det the)
(N dog))
(V saw)
(NP
(Det a)
(N man))
(P in)
shift (NP park<
(Det the)
(N dog))
(V saw)
(NP
(Det a)
(N man))
(P in)
the
reduce P -> in (NP park<
(Det the)
(N dog))
(V saw)
(NP
(Det a)
(N man))
(P in)
(Det the)
shift (NP <
(Det the)
(N dog))
(V saw)
(NP
(Det a)
(N man))
(P in)
(Det the)
park
reduce N -> park (NP <
(Det the)
(N dog))
(V saw)
(NP
(Det a)
(N man))
(P in)
(Det the)
(N park)
reduce NP -> Det N (NP <
(Det the)
(N dog))
(V saw)
(NP
(Det a)
(N man))
(P in)
(NP
(Det the)
(N park))
reduce PP -> P NP (NP <
(Det the)
(N dog))
(V saw)
(NP
(Det a)
(N man))
(PP
(P in)
(NP
(Det the)
(N park)))
reduce NP -> NP PP (NP <
(Det the)
(N dog))
(V saw)
(NP
(NP
(Det a)
(N man))
(PP
(P in)
(NP
(Det the)
(N park))))
reduce VP -> V VP (NP <
(Det the)
(N dog))
(VP
(V saw)
(NP
(NP
(Det a)
(N man))
(PP
(P in)
(NP
(Det the)
(N park)))))
reduce S -> NP VP (S <
(NP
(Det the)
(N dog))
(VP
(V saw)
(NP
(NP
(Det a)
(N man))
(PP
(P in)
(NP
(Det the)
(N park))))))
OK! se llego a S
y no hay texto pendiente por procesar.
[10]:
##
## Ejemplo Shift-reduce parsing
##
sr_parser = nltk.ShiftReduceParser(grammar1, trace=1)
sent = 'Mary saw a dog'.split()
for tree in sr_parser.parse(sent):
print(tree)
Parsing 'Mary saw a dog'
[ * Mary saw a dog]
[ 'Mary' * saw a dog]
[ NP 'saw' * a dog]
[ NP V 'a' * dog]
[ NP V Det 'dog' * ]
(S (NP Mary) (VP (V saw) (NP (Det a) (N dog))))
Ejercicio.— Aplique un parser shift-reduce a la frase “Mary saw a dog”.