import operator

from debian._deb822_repro._util import BufferingIterator
from debian._deb822_repro.tokens import Deb822Token

# Opaque, enum-like tags that identify the kind of content a token carries.
_CONTENT_TYPE_VALUE = "is_value"
_CONTENT_TYPE_COMMENT = "is_comment"
_CONTENT_TYPE_SEPARATOR = "is_separator"

try:
    # These names are only needed for type annotations.
    from typing import Iterator, Union, Literal
    from debian._deb822_repro.types import TokenOrElement, FormatterCallback
except ImportError:
    pass


class FormatterContentToken(object):
    """Typed, tagged text for use with the formatting API

    The FormatterContentToken is used by the formatting API and provides the
    formatter callback with context about the textual tokens it is supposed
    to format.
    """

    __slots__ = ('_text', '_content_type')

    def __init__(self, text, content_type):
        self._text = text
        self._content_type = content_type

    @classmethod
    def from_token_or_element(cls, token_or_element):
        if isinstance(token_or_element, Deb822Token):
            if token_or_element.is_comment:
                return cls.comment_token(token_or_element.text)
            if token_or_element.is_whitespace:
                raise ValueError("FormatterContentType cannot be whitespace")
            return cls.value_token(token_or_element.text)
        # Anything that is not a plain token is treated as value content.
        return cls.value_token(token_or_element.convert_to_text())

    @classmethod
    def separator_token(cls, text):
        # Reuse the shared instances for the two common separators.
        if text == ' ':
            return SPACE_SEPARATOR_FT
        if text == ',':
            return COMMA_SEPARATOR_FT
        return cls(text, _CONTENT_TYPE_SEPARATOR)

    @classmethod
    def comment_token(cls, text):
        """Generates a single comment token with the provided text

        Mostly useful for creating test cases
        """
        return cls(text, _CONTENT_TYPE_COMMENT)

    @classmethod
    def value_token(cls, text):
        """Generates a single value token with the provided text

        Mostly useful for creating test cases
        """
        return cls(text, _CONTENT_TYPE_VALUE)

    @property
    def is_comment(self):
        """True if this formatter token represents a comment

        This should be used for determining whether the token is a comment
        or not. It might be tempting to check whether the text in the token
        starts with a "#" but that is insufficient because a value *can*
        start with that as well. Whether it is a comment or a value is based
        on the context (it is a comment if and only if the "#" was at the
        start of a line) but the formatter often does not have the context
        available to assert this.

        The formatter *should* preserve the order of comments and interleave
        them between the value tokens in the same order as it sees them.
        Failing to preserve the order of comments and values can cause
        confusing comments (such as associating the comment with a different
        value than it was written for).

        The formatter *may* discard comment tokens if it does not want to
        preserve them. If so, they would be omitted in the output, which may
        be acceptable in some cases. This is a lot better than re-ordering
        comments.

        Formatters must be aware of the following special cases for comments:
         * Comments *MUST* be emitted after a newline. If the very first
           token is a comment, the formatter is expected to emit a newline
           before it as well (fields cannot start immediately on a comment).
        """
        return self._content_type is _CONTENT_TYPE_COMMENT

    @property
    def is_value(self):
        """True if this formatter token represents a semantic value

        The formatter *MUST* preserve values as-is in its output. It may
        "unpack" the value from the token (as in, return it as a part of a
        plain str) but the value content must not be changed nor re-ordered
        relative to other value tokens (as that could change the meaning of
        the field).
        """
        return self._content_type is _CONTENT_TYPE_VALUE

    @property
    def is_separator(self):
        """True if this formatter token represents a separator token

        The formatter is not required to preserve the provided separators
        but it is required to properly separate values. In fact, it is often
        a lot easier to discard existing separator tokens. As an example, in
        a whitespace-separated list of values, space, tab and newline all
        count as separators. However, formatting-wise, there is a world of
        difference between a space, a tab and a newline. In particular,
        newlines must be followed by an additional space or tab (to act as a
        value continuation line) if there is a value following it (otherwise,
        the generated output is invalid).
        """
        return self._content_type is _CONTENT_TYPE_SEPARATOR

    # ASSUMPTION: the bodies of the remaining members are not readable in
    # this dump. They are reconstructed from how the rest of the module uses
    # them (`is_whitespace`, `text` and `str(token)`).
    @property
    def is_whitespace(self):
        return (self._content_type is _CONTENT_TYPE_SEPARATOR
                and self._text.isspace())

    @property
    def text(self):
        return self._text

    def __str__(self):
        return self._text

    def __repr__(self):
        return "{}({!r}, {}=True)".format(
            self.__class__.__name__, self._text, self._content_type
        )


SPACE_SEPARATOR_FT = FormatterContentToken(' ', _CONTENT_TYPE_SEPARATOR)
COMMA_SEPARATOR_FT = FormatterContentToken(',', _CONTENT_TYPE_SEPARATOR)


def one_value_per_line_formatter(indentation,
                                 trailing_separator=True,
                                 immediate_empty_line=False,
                                 ):
    """Provide a simple formatter that can handle indentation and trailing separators

    All formatters returned by this function put exactly one value per line.
    This pattern is commonly seen in the "Depends" field and similar fields
    of debian/control files.

    :param indentation: Either the literal string "FIELD_NAME_LENGTH" or a
      positive integer, which determines the indentation for fields. If it is
      an integer, then a fixed indentation is used (notably the value 1
      ensures the shortest possible indentation). Otherwise, if it is
      "FIELD_NAME_LENGTH", then the indentation is set such that it aligns
      the values based on the field name.
    :param trailing_separator: If True, then the last value will have a
      trailing separator token (e.g., ",") after it.
    :param immediate_empty_line: Whether the value should always start with
      an empty line. If True, then the result becomes something like
      "Field:\\n value".
    """
    if indentation != "FIELD_NAME_LENGTH" and indentation < 1:
        raise ValueError('indentation must be at least 1 (or "FIELD_NAME_LENGTH")')

    def _formatter(name, sep_token, formatter_tokens):
        if indentation == "FIELD_NAME_LENGTH":
            # Align with the first value: the field name, the ":" and a space.
            indent_len = len(name) + 2
        else:
            indent_len = indentation
        indent = ' ' * indent_len

        emitted_first_line = False
        tok_iter = BufferingIterator(formatter_tokens)
        is_value = operator.attrgetter('is_value')
        if immediate_empty_line:
            emitted_first_line = True
            yield "\n"
        for t in tok_iter:
            if t.is_comment:
                # Comments must start directly after a newline.
                if not emitted_first_line:
                    yield "\n"
                yield t
            elif t.is_value:
                if not emitted_first_line:
                    yield ' '
                else:
                    yield indent
                yield t
                # Emit the separator if more values follow (or always, when a
                # trailing separator was requested).
                if not sep_token.is_whitespace and (
                        trailing_separator or tok_iter.peek_find(is_value)):
                    yield sep_token
                yield "\n"
            else:
                # Existing separators are discarded.
                continue
            emitted_first_line = True

    return _formatter


one_value_per_line_trailing_separator = one_value_per_line_formatter(
    'FIELD_NAME_LENGTH',
    trailing_separator=True
)


def format_field(formatter, field_name, separator_token, token_iter):
    """Format a field using a provided formatter

    This function formats a series of tokens using the provided formatter.
    It can be used as a standalone formatter engine and can be used in test
    suites to validate third-party formatters (enabling them to test for
    corner cases without involving parsing logic).

    The formatter receives a series of FormatterContentTokens (via the
    token_iter) and is expected to yield one or more str or
    FormatterContentTokens. The calling function will combine all of these
    into a single string, which will be used as the value.

    The formatter is recommended to yield the provided value and comment
    tokens interleaved with text segments of whitespace and separators as
    part of its output. If it preserves comment and value tokens, the
    calling function can provide some runtime checks to catch bugs (like the
    formatter turning a comment into a value because it forgot to ensure
    that the comment was emitted directly after a newline character).

    When writing a formatter, please keep the following in mind:

     * The output of the formatter is appended directly after the ":"
       separator. Most formatters will want to emit either a space or a
       newline as the very first character for readability.
       (compare "Depends:foo\\n" to "Depends: foo\\n")

     * The formatter must always end its output on a newline. This is a
       design choice of how the round-trip safe parser represents values
       that is imposed on the formatter.

     * It is often easier to discard/ignore all separator tokens from the
       provided token sequence and instead just yield separator tokens/str
       where the formatter wants to place them.

         - The formatter is strongly recommended to special-case formatting
           for whitespace separators (check for
           `separator_token.is_whitespace`).

           This is because space, tab and newline all count as valid
           separators and can all appear in the token sequence. If the
           original field uses a mix of these separators, it is likely to
           completely undermine the desired result. Not to mention the
           additional complexity of handling a separator token that happens
           to use the newline character, which affects how the formatter is
           supposed to handle what comes after it (see the rules for
           comments, empty lines and continuation line markers).

     * The formatter must remember to emit a "continuation line" marker
       (typically a single space or tab) when emitting a value after a
       newline or a comment. A `yield " "` is sufficient.

         - The continuation line marker may be embedded inside a str with
           other whitespace (such as the newline coming before it and/or
           whitespace used for indentation purposes following the marker).

     * The formatter must not cause the output to contain completely
       empty/whitespace lines as these cause syntax errors. The first line
       never counts as an empty line (as it will be appended after the field
       name).

     * Tokens must be discriminated via the `token.is_value` (etc.)
       properties. Assuming that `token.text.startswith("#")` implies a
       comment, and similar stunts, is wrong. As an example, "#foo" is a
       perfectly valid value in some contexts.

     * Comment tokens *always* take up exactly one complete line including
       the newline character at the end of the line. They must be emitted
       directly after a newline character or another comment token.

     * Special cases that are rare but can happen:

         - Fields *can* start with comments and then require a newline
           provided by the formatter.
           (Example: "Depends:\\n# Comment here\\n foo")

         - Fields *can* start on a separator or have two separators in a
           row. This is especially true for whitespace-separated fields
           where every whitespace counts as a separator, but it can also
           happen with other separators (such as comma).

         - Value tokens can contain whitespace (for non-whitespace
           separators). When they do, the formatter must not attempt to
           change nor "normalize" the whitespace inside the value token as
           that might change how the value is interpreted. (If you want to
           normalize such whitespace, the formatter is at the wrong
           abstraction level. Instead, manipulate the values directly in the
           value interpretation layer.)

    This function will provide *some* runtime checks of its input and the
    output from the formatter to detect some errors early and provide
    helpful diagnostics. If you use the function for testing, you are
    recommended to rely on verifying the output of the function rather than
    relying on the runtime checks (as these are subject to change).

    :param formatter: A formatter (see FormatterCallback for the type).
      Basic formatting is provided via one_value_per_line_trailing_separator
      (a formatter) or one_value_per_line_formatter (a formatter generator).
    :param field_name: The name of the field.
    :param separator_token: One of SPACE_SEPARATOR_FT and COMMA_SEPARATOR_FT
    :param token_iter: An iterable of tokens to be formatted.

    The following example shows how to define a formatter_callback along
    with a few verifications.

    >>> fmt_field_len_sep = one_value_per_line_trailing_separator
    >>> fmt_shortest = one_value_per_line_formatter(
    ...   1,
    ...   trailing_separator=False
    ... )
    >>> fmt_newline_first = one_value_per_line_formatter(
    ...   1,
    ...   trailing_separator=False,
    ...   immediate_empty_line=True
    ... )
    >>> # Omit separator tokens in the token list for simplicity (the formatter does
    >>> # not use them, and it enables us to keep the example simple by reusing the list)
    >>> tokens = [
    ...     FormatterContentToken.value_token("foo"),
    ...     FormatterContentToken.comment_token("# some comment about bar\\n"),
    ...     FormatterContentToken.value_token("bar"),
    ... ]
    >>> # Starting with fmt_field_len_sep
    >>> print(format_field(fmt_field_len_sep, "Depends", COMMA_SEPARATOR_FT, tokens), end='')
    Depends: foo,
    # some comment about bar
             bar,
    >>> print(format_field(fmt_field_len_sep, "Architecture", SPACE_SEPARATOR_FT, tokens), end='')
    Architecture: foo
    # some comment about bar
                  bar
    >>> # Control check for the special case where the field starts with a comment
    >>> print(format_field(fmt_field_len_sep, "Depends", COMMA_SEPARATOR_FT, tokens[1:]), end='')
    Depends:
    # some comment about bar
             bar,
    >>> # Also, check single line values (to ensure it ends on a newline)
    >>> print(format_field(fmt_field_len_sep, "Depends", COMMA_SEPARATOR_FT, tokens[2:]), end='')
    Depends: bar,
    >>> ### Changing format to the shortest length
    >>> print(format_field(fmt_shortest, "Depends", COMMA_SEPARATOR_FT, tokens), end='')
    Depends: foo,
    # some comment about bar
     bar
    >>> print(format_field(fmt_shortest, "Architecture", SPACE_SEPARATOR_FT, tokens), end='')
    Architecture: foo
    # some comment about bar
     bar
    >>> # Control check for the special case where the field starts with a comment
    >>> print(format_field(fmt_shortest, "Depends", COMMA_SEPARATOR_FT, tokens[1:]), end='')
    Depends:
    # some comment about bar
     bar
    >>> # Also, check single line values (to ensure it ends on a newline)
    >>> print(format_field(fmt_shortest, "Depends", COMMA_SEPARATOR_FT, tokens[2:]), end='')
    Depends: bar
    >>> ### Changing format to the newline first format
    >>> print(format_field(fmt_newline_first, "Depends", COMMA_SEPARATOR_FT, tokens), end='')
    Depends:
     foo,
    # some comment about bar
     bar
    >>> print(format_field(fmt_newline_first, "Architecture", SPACE_SEPARATOR_FT, tokens), end='')
    Architecture:
     foo
    # some comment about bar
     bar
    >>> # Control check for the special case where the field starts with a comment
    >>> print(format_field(fmt_newline_first, "Depends", COMMA_SEPARATOR_FT, tokens[1:]), end='')
    Depends:
    # some comment about bar
     bar
    >>> # Also, check single line values (to ensure it ends on a newline)
    >>> print(format_field(fmt_newline_first, "Depends", COMMA_SEPARATOR_FT, tokens[2:]), end='')
    Depends:
     bar
    """
    formatted_tokens = [field_name, ':']
    just_after_newline = False
    last_was_value_token = False
    if isinstance(token_iter, list):
        # Reject input that is known to be invalid before formatting it.
        last_token = token_iter[-1]
        if last_token.is_comment:
            raise ValueError("Invalid token_iter: Field values cannot end with comments")
    for token in formatter(field_name, separator_token, token_iter):
        token_as_text = str(token)
        # When the formatter yields typed tokens, use them to verify the output.
        if isinstance(token, FormatterContentToken):
            if token.is_comment:
                if not just_after_newline:
                    raise ValueError("Bad format: Comments must appear directly after a newline.")
                if not token_as_text.startswith('#'):
                    raise ValueError("Invalid Comment token: Must start with #")
                if not token_as_text.endswith("\n"):
                    raise ValueError("Invalid Comment token: Must end on a newline")
            elif token.is_value:
                if token_as_text[0].isspace() or token_as_text[-1].isspace():
                    raise ValueError("Invalid Value token: It cannot start nor end on whitespace")
                if just_after_newline:
                    raise ValueError("Bad format: Missing continuation line marker")
                if last_was_value_token:
                    raise ValueError("Bad format: Formatter omitted a separator")

            last_was_value_token = token.is_value
        else:
            last_was_value_token = False

        if just_after_newline:
            if token_as_text[0] in ('\r', '\n'):
                raise ValueError("Bad format: Saw completely empty line.")
            if not token_as_text[0].isspace() and not token_as_text.startswith('#'):
                raise ValueError("Bad format: Saw completely empty line.")
        formatted_tokens.append(token_as_text)
        just_after_newline = token_as_text.endswith("\n")

    formatted_text = ''.join(formatted_tokens)
    if not formatted_text.endswith("\n"):
        raise ValueError("Bad format: The field value must end on a newline")
    return formatted_text
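
The rules in the format_field docstring can be illustrated with a small custom formatter. The sketch below is not part of the module. It keeps all values on a single line and only breaks the line where a comment forces it. To stay runnable without python-debian installed, it uses a hypothetical stand-in class `Tok` in place of FormatterContentToken and a hypothetical `render` helper that joins the output the way format_field is described to do, without the runtime checks. The names `Tok`, `render`, `single_line_formatter`, `COMMA` and `SPACE` are assumptions made for this example only.

```python
class Tok:
    """Hypothetical stand-in for FormatterContentToken (illustration only)."""

    def __init__(self, text, kind):
        self.text = text
        self.kind = kind  # one of "value", "comment", "separator"

    @property
    def is_value(self):
        return self.kind == "value"

    @property
    def is_comment(self):
        return self.kind == "comment"

    @property
    def is_whitespace(self):
        return self.kind == "separator" and self.text.isspace()

    def __str__(self):
        return self.text


COMMA = Tok(",", "separator")
SPACE = Tok(" ", "separator")


def single_line_formatter(name, sep_token, tokens):
    """Keep values on one line; break only where a comment requires it."""
    # Discard the original separators and decide placement ourselves.
    items = [t for t in tokens if t.is_value or t.is_comment]
    at_line_start = False
    for index, tok in enumerate(items):
        if tok.is_comment:
            # A comment must start directly after a newline.
            if not at_line_start:
                yield "\n"
            yield tok
            at_line_start = True
            continue
        # The space is the readability space after ":" on the first line and
        # the continuation line marker after a newline or a comment.
        yield " "
        yield tok
        at_line_start = False
        more_values = any(later.is_value for later in items[index + 1:])
        if more_values and not sep_token.is_whitespace:
            yield sep_token
    # The output must always end on a newline.
    if not at_line_start:
        yield "\n"


def render(name, sep_token, tokens):
    """Hypothetical helper: join the formatter output after "name:"."""
    parts = [name, ":"]
    parts.extend(str(part) for part in single_line_formatter(name, sep_token, tokens))
    return "".join(parts)


values = [Tok("foo", "value"), Tok("# about bar\n", "comment"), Tok("bar", "value")]
print(render("Depends", COMMA, values), end="")
```

With the real module, the same generator could presumably be passed to format_field together with FormatterContentToken instances, in which case the runtime checks described above would also apply to its output.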