a [f,@sddlmZmZmZddlmZddlmZm Z ddl m Z ddl m Z ddl mZddl mZmZdd l mZmZmZdd l mZmZdd l mZdd lmZdd lmZeeZe dkreZne ZGdddeZdS))absolute_importdivisionunicode_literals)unichr)deque OrderedDict) version_info)spaceCharacters)entities) asciiLettersasciiUpper2Lower)digits hexDigitsEOF) tokenTypes tagTokenTypes)replacementCharacters)HTMLInputStream)Trie)csdeZdZdZdfdd ZddZddZdd d Zd d ZddZ ddZ ddZ ddZ ddZ ddZddZddZddZd d!Zd"d#Zd$d%Zd&d'Zd(d)Zd*d+Zd,d-Zd.d/Zd0d1Zd2d3Zd4d5Zd6d7Zd8d9Zd:d;Zdd?Z!d@dAZ"dBdCZ#dDdEZ$dFdGZ%dHdIZ&dJdKZ'dLdMZ(dNdOZ)dPdQZ*dRdSZ+dTdUZ,dVdWZ-dXdYZ.dZd[Z/d\d]Z0d^d_Z1d`daZ2dbdcZ3dddeZ4dfdgZ5dhdiZ6djdkZ7dldmZ8dndoZ9dpdqZ:drdsZ;dtduZdzd{Z?d|d}Z@d~dZAddZBddZCddZDddZEddZFddZGddZHddZIddZJddZKddZLZMS) HTMLTokenizera  This class takes care of tokenizing HTML. * self.currentToken Holds the token that is currently being processed. * self.state Holds a reference to the method to be invoked... XXX * self.stream Points to HTMLInputStream object. Nc sJt|fi||_||_d|_g|_|j|_d|_d|_t t | dS)NF) rstreamparserZ escapeFlagZ lastFourChars dataStatestateescape currentTokensuperr__init__)selfrrkwargs __class__C/usr/lib/python3.9/site-packages/pip/_vendor/html5lib/_tokenizer.pyr (szHTMLTokenizer.__init__ccsPtg|_|rL|jjr6td|jjddVq|jr |jVq6q dS)z This is where the magic happens. We do our usually processing through the states and when we have a token to return we yield the token which pauses processing until the next token is requested. ParseErrorrtypedataN)r tokenQueuerrerrorsrpoppopleftr!r%r%r&__iter__7s  zHTMLTokenizer.__iter__c Cst}d}|rt}d}g}|j}||vrH|turH|||j}q"td||}|tvrt|}|j t ddd|idnd|krd ksn|d krd }|j t ddd|idnd |krd ks>nd|krdks>nd|krdks>nd|kr*dks>n|t gdvrZ|j t ddd|idz t |}Wn<t y|d}t d|d?Bt d|d@B}Yn0|dkr|j t ddd|j||S)zThis function returns either U+FFFD or the character based on the decimal or hexadecimal representation. It also discards ";" if present. If not present self.tokenQueue.append({"type": tokenTypes["ParseError"]}) is invoked. r'z$illegal-codepoint-for-numeric-entity charAsIntr)r*Zdatavarsii�r ii)# iiiiiiiiiiiiiiiiiii i i i i i i i i i iiiiir6iii;z numeric-entity-without-semicolonr()rrrcharrappendintjoinrr+r frozensetchr ValueErrorunget) r!ZisHexallowedradix charStackcr4r?vr%r%r&consumeNumberEntityGsn              &   z!HTMLTokenizer.consumeNumberEntityFc Csd}|jg}|dtvsB|dtddfvsB|durV||dkrV|j|dn|ddkr d}||j|ddvrd}||j|r|dtvs|s|dtvr|j|d||}n4|j t d d d |j| dd |}nf|dturDt d |s0qD||jq z$t d |dd}t|}Wntyd}Yn0|dur>|dd kr|j t d dd |dd kr|r||tvs||tvs||dkr|j| dd |}n.t|}|j| |d ||d7}n4|j t d dd |j| dd |}|r|jddd|7<n*|tvrd}nd}|j t ||d dS)N&r<#F)xXTr'zexpected-numeric-entityr(r3r>znamed-entity-without-semicolon=zexpected-named-entityr*r SpaceCharacters Characters)rr?r rrFr@rrrLr+rr-rB entitiesTrieZhas_keys_with_prefixZlongest_prefixlenKeyErrorr r r) r! allowedChar fromAttributeoutputrIhexZ entityNameZ entityLengthZ tokenTyper%r%r& consumeEntitys~               zHTMLTokenizer.consumeEntitycCs|j|dddS)zIThis method replaces the need for "entityInAttributeValueState". T)rYrZN)r])r!rYr%r%r&processEntityInAttributesz&HTMLTokenizer.processEntityInAttributecCs|j}|dtvr|dt|d<|dtdkrp|d}t|}t|t|krh||ddd||d<|dtdkr|dr|j tdd d |d r|j tdd d |j ||j |_ dS) zThis method is a generic handler for emitting the tags. It also sets the state to "data" because that's what's needed after a token has been emitted. r)nameStartTagr*NrPEndTagr'zattributes-in-end-tagr( selfClosingzself-closing-flag-on-end-tag) rr translater r attributeMaprWupdater+r@rr)r!tokenrawr*r%r%r&emitCurrentTokens(    zHTMLTokenizer.emitCurrentTokencCs|j}|dkr|j|_n|dkr.|j|_n|dkrd|jtddd|jtdddn`|turpdS|t vr|jtd ||j t d dn&|j d }|jtd||dd S) NrMrNr'invalid-codepointr(rUFrTTrMrNri) rr?entityDataStater tagOpenStater+r@rrr charsUntilr!r*charsr%r%r&rs.          zHTMLTokenizer.dataStatecCs||j|_dSNT)r]rrr/r%r%r&rlszHTMLTokenizer.entityDataStatecCs|j}|dkr|j|_n|dkr.|j|_n|tkr:dS|dkrp|jtddd|jtdd dnT|t vr|jtd ||j t d dn&|j d }|jtd||dd S) NrMrNFrir'rjr(rUr7rTTrk) rr?characterReferenceInRcdatarrcdataLessThanSignStaterr+r@rr rnror%r%r& rcdataState"s.          zHTMLTokenizer.rcdataStatecCs||j|_dSrq)r]rtrr/r%r%r&rr?sz(HTMLTokenizer.characterReferenceInRcdatacCs|j}|dkr|j|_nh|dkrR|jtddd|jtdddn2|tkr^dS|jd }|jtd||dd S NrNrir'rjr(rUr7F)rNriT) rr?rawtextLessThanSignStaterr+r@rrrnror%r%r& rawtextStateDs"       zHTMLTokenizer.rawtextStatecCs|j}|dkr|j|_nh|dkrR|jtddd|jtdddn2|tkr^dS|jd }|jtd||dd Sru) rr?scriptDataLessThanSignStaterr+r@rrrnror%r%r&scriptDataStateVs"       zHTMLTokenizer.scriptDataStatecCsr|j}|tkrdS|dkrL|jtddd|jtdddn"|jtd||jdddS) NFrir'rjr(rUr7T)rr?rr+r@rrnr!r*r%r%r&plaintextStatehs     zHTMLTokenizer.plaintextStatecCs |j}|dkr|j|_n|dkr.|j|_n|tvrVtd|gddd|_|j|_n|dkr|j tddd |j td d d |j |_nt|d kr|j tdd d |j ||j |_n@|j tddd |j td dd |j ||j |_dS)N!/r`F)r)r_r*rbZselfClosingAcknowledged>r'z'expected-tag-name-but-got-right-bracketr(rUz<>?z'expected-tag-name-but-got-question-markzexpected-tag-namerNT)rr?markupDeclarationOpenStatercloseTagOpenStater rr tagNameStater+r@rrFbogusCommentStaterzr%r%r&rmws>           zHTMLTokenizer.tagOpenStatecCs|j}|tvr0td|gdd|_|j|_n|dkrX|jtddd|j |_nn|t ur|jtddd|jtd d d|j |_n0|jtdd d |id |j ||j |_dS)NraFr)r_r*rbr~r'z*expected-closing-tag-but-got-right-bracketr(z expected-closing-tag-but-got-eofrU|tkr|jtdd d|j |_n|jtd|dd S NrrUr(rNrir'rjr7eof-in-script-in-scriptT) rr?r+r@r scriptDataDoubleEscapedDashStater(scriptDataDoubleEscapedLessThanSignStaterrrzr%r%r&rs*        z*HTMLTokenizer.scriptDataDoubleEscapedStatecCs|j}|dkr2|jtddd|j|_n|dkrZ|jtddd|j|_n|dkr|jtddd|jtddd|j|_nF|t kr|jtdd d|j |_n|jtd|d|j|_d Sr) rr?r+r@r$scriptDataDoubleEscapedDashDashStaterrrrrrzr%r%r&rs.        z.HTMLTokenizer.scriptDataDoubleEscapedDashStatecCs|j}|dkr*|jtdddn|dkrR|jtddd|j|_n|dkrz|jtddd|j|_n|dkr|jtddd|jtdd d|j|_nF|t kr|jtdd d|j |_n|jtd|d|j|_d S) NrrUr(rNr~rir'rjr7rT) rr?r+r@rrrryrrrrzr%r%r&r%s2        z2HTMLTokenizer.scriptDataDoubleEscapedDashDashStatecCsP|j}|dkr8|jtdddd|_|j|_n|j||j |_dS)Nr}rUr(r3T) rr?r+r@rrscriptDataDoubleEscapeEndStaterrFrrzr%r%r&r>s   z6HTMLTokenizer.scriptDataDoubleEscapedLessThanSignStatecCs|j}|ttdBvrR|jtd|d|jdkrH|j |_ q|j |_ nB|t vr|jtd|d|j|7_n|j ||j |_ dSr)rr?r rCr+r@rrrrrrr rFrzr%r%r&rIs    z,HTMLTokenizer.scriptDataDoubleEscapeEndStatecCs0|j}|tvr$|jtdn|tvrJ|jd|dg|j|_n|dkr\| n|dkrn|j |_n|dvr|j t ddd |jd|dg|j|_n|d kr|j t dd d |jdd dg|j|_nF|t ur|j t dd d |j|_n|jd|dg|j|_dS)NTr*r3r~r})'"rSrNr'#invalid-character-in-attribute-namer(rirjr7z#expected-attribute-name-but-got-eof)rr?r rnr rr@attributeNameStaterrhrr+rrrrzr%r%r&rYs<           z&HTMLTokenizer.beforeAttributeNameStatecCs|j}d}d}|dkr&|j|_n.|tvr\|jddd||jtd7<d}n|dkrjd}n|tvr||j|_n|dkr|j |_n|d kr|j t d d d |jdddd 7<d}n|dvr |j t d dd |jddd|7<d}nH|t ur6|j t d dd |j|_n|jddd|7<d}|r|jdddt|jddd<|jdddD]>\}}|jddd|kr|j t d dd qҐq|r|dS)NTFrSr*rPrr~r}rir'rjr(r7rrrNrzeof-in-attribute-namezduplicate-attribute)rr?beforeAttributeValueStaterr rrnr afterAttributeNameStaterr+r@rrrrcr rh)r!r*ZleavingThisStateZ emitTokenr__r%r%r&rws^             z HTMLTokenizer.attributeNameStatecCsD|j}|tvr$|jtdn|dkr8|j|_n|dkrJ|n|tvrp|jd |dg|j |_n|dkr|j |_n|dkr|j t dd d |jd d dg|j |_n|d vr|j t dd d |jd |dg|j |_nF|tur$|j t ddd |j|_n|jd |dg|j |_dS)NTrSr~r*r3r}rir'rjr(r7rz&invalid-character-after-attribute-namezexpected-end-of-tag-but-got-eof)rr?r rnrrrhr rr@rrr+rrrrzr%r%r&rs@            z%HTMLTokenizer.afterAttributeNameStatecCsh|j}|tvr$|jtdn@|dkr8|j|_n,|dkrX|j|_|j|n |dkrj|j|_n|dkr|j t ddd| n|d kr|j t dd d|j d d d d7<|j|_n|dvr|j t ddd|j d d d |7<|j|_nL|turB|j t ddd|j|_n"|j d d d |7<|j|_dS)NTrrMrr~r'z.expected-attribute-value-but-got-right-bracketr(rirjr*rPr r7)rSrN`z"equals-in-unquoted-attribute-valuez$expected-attribute-value-but-got-eof)rr?r rnattributeValueDoubleQuotedStaterattributeValueUnQuotedStaterFattributeValueSingleQuotedStater+r@rrhrrrrzr%r%r&rsF             z'HTMLTokenizer.beforeAttributeValueStatecCs|j}|dkr|j|_n|dkr0|dn|dkrj|jtddd|jddd d 7<nN|t ur|jtdd d|j |_n&|jddd ||j d 7<d S)NrrMrir'rjr(r*rPr r7z#eof-in-attribute-value-double-quote)rrMriT rr?afterAttributeValueStaterr^r+r@rrrrrnrzr%r%r&rs&       z-HTMLTokenizer.attributeValueDoubleQuotedStatecCs|j}|dkr|j|_n|dkr0|dn|dkrj|jtddd|jddd d 7<nN|t ur|jtdd d|j |_n&|jddd ||j d 7<d S)NrrMrir'rjr(r*rPr r7z#eof-in-attribute-value-single-quote)rrMriTrrzr%r%r&rs&       z-HTMLTokenizer.attributeValueSingleQuotedStatecCs|j}|tvr|j|_n|dkr0|dn|dkrB|n|dvr||jt ddd|j ddd |7<n|d kr|jt dd d|j ddd d 7<nV|t ur|jt dd d|j |_n.|j ddd ||j tdtB7<dS)NrMr~)rrrSrNrr'z0unexpected-character-in-unquoted-attribute-valuer(r*rPr rirjr7z eof-in-attribute-value-no-quotes)rMr~rrrSrNrriT)rr?r rrr^rhr+r@rrrrrnrCrzr%r%r&rs4         z)HTMLTokenizer.attributeValueUnQuotedStatecCs|j}|tvr|j|_n|dkr.|np|dkr@|j|_n^|turt|j t ddd|j ||j |_n*|j t ddd|j ||j|_dS)Nr~r}r'z$unexpected-EOF-after-attribute-valuer(z*unexpected-character-after-attribute-valueT) rr?r rrrhrrr+r@rrFrrzr%r%r&r.s&         z&HTMLTokenizer.afterAttributeValueStatecCs|j}|dkr&d|jd<|n^|turZ|jtddd|j||j |_ n*|jtddd|j||j |_ dS)Nr~Trbr'z#unexpected-EOF-after-solidus-in-tagr(z)unexpected-character-after-solidus-in-tag) rr?rrhrr+r@rrFrrrrzr%r%r&rBs         z&HTMLTokenizer.selfClosingStartTagStatecCsD|jd}|dd}|jtd|d|j|j|_dS)Nr~rir7Commentr(T) rrnreplacer+r@rr?rrrzr%r%r&rTs    zHTMLTokenizer.bogusCommentStatecCs|jg}|ddkrR||j|ddkrPtddd|_|j|_dSn|ddvrd}dD](}||j|d|vrfd }qqf|rtd ddddd |_|j|_dSn|dd krD|jdurD|jj j rD|jj j dj |jj j krDd}d D].}||j|d|krd }q2q|rD|j |_dS|jtddd|rt|j|qZ|j|_dS)NrPrrr3r(T)dD))oOrJCtTyYpPeEFZDoctype)r)r_publicIdsystemIdcorrect[)rrArrrr'zexpected-dashes-or-doctype)rr?r@rrcommentStartStater doctypeStaterZtreeZ openElements namespaceZdefaultNamespacecdataSectionStater+rFr-r)r!rImatchedexpectedr%r%r&rcsZ       z(HTMLTokenizer.markupDeclarationOpenStatecCs|j}|dkr|j|_n|dkrN|jtddd|jdd7<n|dkr|jtdd d|j|j|j|_nP|t ur|jtdd d|j|j|j|_n|jd|7<|j |_d S) Nrrir'rjr(r*r7r~incorrect-commenteof-in-commentT) rr?commentStartDashStaterr+r@rrrr commentStaterzr%r%r&rs.       zHTMLTokenizer.commentStartStatecCs|j}|dkr|j|_n|dkrN|jtddd|jdd7<n|dkr|jtdd d|j|j|j|_nT|t ur|jtdd d|j|j|j|_n|jdd|7<|j |_d S) Nrrir'rjr(r*-�r~rrT) rr?commentEndStaterr+r@rrrrrrzr%r%r&rs.       z#HTMLTokenizer.commentStartDashStatecCs|j}|dkr|j|_n|dkrN|jtddd|jdd7<nT|tur|jtddd|j|j|j |_n|jd||j d 7<d S) Nrrir'rjr(r*r7r)rriT) rr?commentEndDashStaterr+r@rrrrrnrzr%r%r&rs$       zHTMLTokenizer.commentStatecCs|j}|dkr|j|_n|dkrV|jtddd|jdd7<|j|_nT|t ur|jtddd|j|j|j |_n|jdd|7<|j|_d S) Nrrir'rjr(r*rzeof-in-comment-end-dashT) rr?rrr+r@rrrrrrzr%r%r&rs$      z!HTMLTokenizer.commentEndDashStatecCs,|j}|dkr*|j|j|j|_n|dkrd|jtddd|jdd7<|j|_n|dkr|jtdd d|j |_n|d kr|jtdd d|jd|7<nj|t ur|jtdd d|j|j|j|_n4|jtdd d|jdd|7<|j|_dS)Nr~rir'rjr(r*u--�r|z,unexpected-bang-after-double-dash-in-commentrz,unexpected-dash-after-double-dash-in-commentzeof-in-comment-double-dashzunexpected-char-in-commentz--T) rr?r+r@rrrrrcommentEndBangStaterrzr%r%r&rs@          zHTMLTokenizer.commentEndStatecCs|j}|dkr*|j|j|j|_n|dkrN|jdd7<|j|_n|dkr|jtddd|jdd 7<|j |_nT|t ur|jtdd d|j|j|j|_n|jdd|7<|j |_d S) Nr~rr*z--!rir'rjr(u--!�zeof-in-comment-end-bang-stateT) rr?r+r@rrrrrrrrzr%r%r&rs,       z!HTMLTokenizer.commentEndBangStatecCs|j}|tvr|j|_nj|tur\|jtdddd|j d<|j|j |j |_n*|jtddd|j ||j|_dS)Nr'!expected-doctype-name-but-got-eofr(Frzneed-space-after-doctypeT) rr?r beforeDoctypeNameStaterrr+r@rrrrFrzr%r%r&rs        zHTMLTokenizer.doctypeStatecCs|j}|tvrn|dkrT|jtdddd|jd<|j|j|j|_n|dkr|jtdddd |jd <|j |_nR|t ur|jtdd dd|jd<|j|j|j|_n||jd <|j |_d S) Nr~r'z+expected-doctype-name-but-got-right-bracketr(Frrirjr7r_rT) rr?r r+r@rrrrdoctypeNameStaterrzr%r%r&r*s4           z$HTMLTokenizer.beforeDoctypeNameStatecCs|j}|tvr2|jdt|jd<|j|_n|dkrh|jdt|jd<|j |j|j |_n|dkr|j t ddd|jdd7<|j |_nh|t ur|j t dddd |jd <|jdt|jd<|j |j|j |_n|jd|7<d S) Nr_r~rir'rjr(r7zeof-in-doctype-nameFrT)rr?r rrcr afterDoctypeNameStaterr+r@rrrrrzr%r%r&rDs0        zHTMLTokenizer.doctypeNameStatecCsH|j}|tvrn.|dkr8|j|j|j|_n |turd|jd<|j ||jt ddd|j|j|j|_n|dvrd}d D]}|j}||vrd}qq|r|j |_dSnD|d vr d}d D]}|j}||vrd}qq|r |j |_dS|j ||jt dd d |idd|jd<|j |_dS)Nr~Frr'eof-in-doctyper(rT))uU)bB)lL)iIrsS)rrrr)mMz*expected-space-or-right-bracket-in-doctyper*r5)rr?r r+r@rrrrrFrafterDoctypePublicKeywordStateafterDoctypeSystemKeywordStatebogusDoctypeState)r!r*rrr%r%r&r]sT            z#HTMLTokenizer.afterDoctypeNameStatecCs|j}|tvr|j|_n|dvrP|jtddd|j||j|_nT|t ur|jtdddd|j d<|j|j |j |_n|j||j|_dS N)rrr'unexpected-char-in-doctyper(rFrT) rr?r "beforeDoctypePublicIdentifierStaterr+r@rrFrrrrzr%r%r&rs&         z,HTMLTokenizer.afterDoctypePublicKeywordStatecCs|j}|tvrn|dkr0d|jd<|j|_n|dkrLd|jd<|j|_n|dkr|jt dddd |jd <|j|j|j |_nh|t ur|jt dd dd |jd <|j|j|j |_n(|jt dd dd |jd <|j |_d S)Nrr3rrr~r'unexpected-end-of-doctyper(FrrrT) rr?r r(doctypePublicIdentifierDoubleQuotedStater(doctypePublicIdentifierSingleQuotedStater+r@rrrrrzr%r%r&rs:             z0HTMLTokenizer.beforeDoctypePublicIdentifierStatecCs|j}|dkr|j|_n|dkrN|jtddd|jdd7<n|dkr|jtdd dd |jd <|j|j|j|_nR|t ur|jtdd dd |jd <|j|j|j|_n|jd|7<d S)Nrrir'rjr(rr7r~rFrrT rr?!afterDoctypePublicIdentifierStaterr+r@rrrrrzr%r%r&rs0         z6HTMLTokenizer.doctypePublicIdentifierDoubleQuotedStatecCs|j}|dkr|j|_n|dkrN|jtddd|jdd7<n|dkr|jtdd dd |jd <|j|j|j|_nR|t ur|jtdd dd |jd <|j|j|j|_n|jd|7<d S)Nrrir'rjr(rr7r~rFrrTrrzr%r%r&rs0         z6HTMLTokenizer.doctypePublicIdentifierSingleQuotedStatecCs |j}|tvr|j|_n|dkr<|j|j|j|_n|dkrn|jt dddd|jd<|j |_n|dkr|jt dddd|jd<|j |_nh|t ur|jt dd dd |jd <|j|j|j|_n(|jt dddd |jd <|j |_d S) Nr~rr'rr(r3rrrFrT)rr?r -betweenDoctypePublicAndSystemIdentifiersStaterr+r@rrr(doctypeSystemIdentifierDoubleQuotedState(doctypeSystemIdentifierSingleQuotedStaterrrzr%r%r&rs>              z/HTMLTokenizer.afterDoctypePublicIdentifierStatecCs|j}|tvrn|dkr4|j|j|j|_n|dkrPd|jd<|j|_n|dkrld|jd<|j |_nh|t kr|jt dddd |jd <|j|j|j|_n(|jt dd dd |jd <|j |_d S) Nr~rr3rrr'rr(FrrT) rr?r r+r@rrrrrrrrrzr%r%r&rs2           z;HTMLTokenizer.betweenDoctypePublicAndSystemIdentifiersStatecCs|j}|tvr|j|_n|dvrP|jtddd|j||j|_nT|t ur|jtdddd|j d<|j|j |j |_n|j||j|_dSr) rr?r "beforeDoctypeSystemIdentifierStaterr+r@rrFrrrrzr%r%r&r)s&         z,HTMLTokenizer.afterDoctypeSystemKeywordStatecCs|j}|tvrn|dkr0d|jd<|j|_n|dkrLd|jd<|j|_n|dkr|jt dddd |jd <|j|j|j |_nh|t ur|jt dd dd |jd <|j|j|j |_n(|jt dddd |jd <|j |_d S) Nrr3rrr~r'rr(FrrT) rr?r rrrrr+r@rrrrrzr%r%r&r=s:             z0HTMLTokenizer.beforeDoctypeSystemIdentifierStatecCs|j}|dkr|j|_n|dkrN|jtddd|jdd7<n|dkr|jtdd dd |jd <|j|j|j|_nR|t ur|jtdd dd |jd <|j|j|j|_n|jd|7<d S)Nrrir'rjr(rr7r~rFrrT rr?!afterDoctypeSystemIdentifierStaterr+r@rrrrrzr%r%r&rZs0         z6HTMLTokenizer.doctypeSystemIdentifierDoubleQuotedStatecCs|j}|dkr|j|_n|dkrN|jtddd|jdd7<n|dkr|jtdd dd |jd <|j|j|j|_nR|t ur|jtdd dd |jd <|j|j|j|_n|jd|7<d S)Nrrir'rjr(rr7r~rFrrTrrzr%r%r&rrs0         z6HTMLTokenizer.doctypeSystemIdentifierSingleQuotedStatecCs|j}|tvrn~|dkr4|j|j|j|_n^|turt|jt dddd|jd<|j|j|j|_n|jt ddd|j |_dS) Nr~r'rr(FrrT) rr?r r+r@rrrrrrrzr%r%r&rs$      z/HTMLTokenizer.afterDoctypeSystemIdentifierStatecCsZ|j}|dkr*|j|j|j|_n,|turV|j||j|j|j|_ndS)Nr~T) rr?r+r@rrrrrFrzr%r%r&rs    zHTMLTokenizer.bogusDoctypeStatecCsg}||jd||jd|j}|tkr>qq|dksJJ|ddddkrv|ddd|d<qq||qd|}|d}|dkrt|D]}|jt d d d q| dd }|r|jt d |d |j |_ dS)N]r~rPz]]r3rirr'rjr(r7rUT) r@rrnr?rrBcountranger+rrrr)r!r*r?Z nullCountrr%r%r&rs2          zHTMLTokenizer.cdataSectionState)N)NF)N__name__ __module__ __qualname____doc__r r0rLr]r^rhrrlrtrrrwryr{rmrrrsrrrvrrrxrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr __classcell__r%r%r#r&rs H P#         6 "-3rN) Z __future__rrrZpip._vendor.sixrrD collectionsrrsysrZ constantsr r r r rrrrrrZ _inputstreamrZ_trierrVdictrdobjectrr%r%r%r&s