xls2xlsx package

Submodules

xls2xlsx.cli module

Console script for xls2xlsx.

xls2xlsx.cli.main()[source]

Console script for xls2xlsx.

xls2xlsx.htmlxls2xlsx module

class xls2xlsx.htmlxls2xlsx.CSSStyle[source]

Bases: object

DEFAULT_CELL_WIDTH_PX = 64
DEFAULT_POINT_SIZE = 10
MAX_CELL_HEIGHT_PT = 409
MAX_CELL_WIDTH_UNITS = 255
MIN_CELL_HEIGHT_PT = 11.25
MIN_CELL_WIDTH_PX = 15
RETRIES = 6
SPREADSHEET_HEIGHT_PX = 800
SPREADSHEET_WIDTH_PX = 1900
add_style_sheet(style)[source]
apply_style(tags)[source]

Apply the appropriate style to the given element(s). The tags is a list of tuples of (tag_name, tag_attrs), given in the proper nested order.

Handled: * .class #id element element.class element,element (handled above) element element (direct descendants only) element>element [attribute] [attribute=value]

Not handled: .class1.class2 .class1 .class2 element+element element~element

static css_escape_unicode(s)[source]
static fixup_excel_width(wid)[source]
static format_style(style)[source]

Convert a parsed style dict back to a style string

static get_pt(item, default=0.0, default_units='px', spreadsheet_pt=1425.0)[source]

For an item like 0.5pt or 2px, get the float value in units of pt

static get_px(item, default=0.0, spreadsheet_px=1900)[source]

For an item like 0.5pt or 2px, get the float value in units of px

static get_units(item, default=None)[source]

For an item like 0.5pt, return the units = ‘pt’

static get_value(item, default=0.0)[source]

For an item like 0.5pt, get the float value = 0.5

static html_escape_unicode(s)[source]
static join(base, fn)[source]

Join a base URL with a filename or a base local path with a filename

map_font(font)[source]

Map the given font to something that Excel probably has

parse_style(styl)[source]

Parse a style string (e.g. from style=”…”) to a dict

static px_to_units(px)[source]

Convert pixels to excel column width units (determined empirically)

static read(f, mode='', quiet=False, retries=6)[source]

Read from either a URL or a filename or file-like object. If mode is ‘b’, then read in binary. If f is a file-like object and you want it read in unicode with the proper encoding, then open it using ‘rb’ mode and don’t pass a mode to this method. If quiet is True, then return None rather than raising an exception on errors

style_to_xlsx(style, font=None, fill=None, border=None, alignment=None, number_format=None)[source]

Convert this style to the appropriate attributes for openpyxl. returns a tuple with (font, fill, border, alignment)

static to_xlsx_color(color)[source]
static units_to_px(units)[source]

Convert Excel column width units to pixels

update_style(style, new_style, new_tag=None, parent=False)[source]

Like normal dict.update() but handle relative size fonts (% and em), inherit, and initial. The new_tag specifies the tag name associated with the new_style and is used to process values of “initial”. If parent is True (default), then the “style” element is the parent style of “new_style”, which determines if properties are inherited or just merged.

class xls2xlsx.htmlxls2xlsx.FontUtils[source]

Bases: object

LINE_HEIGHT_FACTOR = 1.25
get_font[source]
get_font_path(name, bold=False, italic=False)[source]

Given a font name (which may not be in the proper case), return the path to the font file or None if not found

get_font_size(font, s)[source]

Get the width and height of text ‘s’ in the given font. ‘s’ is a single line of text (without newlines)

get_real_font_name(name, bold=False, italic=False)[source]

Given a font name which may not be in the proper case, return the real font name, or None if not found

lines_needed(img_width, s, font)[source]

How many lines are needed to render this text ‘s’ using img_width pixels?

static pt_to_px(pt)[source]
static px_to_pt(px)[source]
static str_to_filename(s, ext)[source]

Convert this string to a valid filename (used for debugging only)

class xls2xlsx.htmlxls2xlsx.HTMLXLS2XLSX(f, dirname='.')[source]

Bases: object

Convert an xls file with html contents into and xlsx file

to_xlsx(filename=None, workbook=None, worksheet=None, sheet_name=None)[source]

Convert to xlsx using openpyxl. If filename is not None, then the result is written to that file, and the filename is returned, else the workbook is returned. If workbook is passed, then the worksheet is written to the given workbook

xls2xlsx.utils module

xls2xlsx.utils.is_date_format(fmt)[source]
xls2xlsx.utils.perform_number_format(value, number_format)[source]

This is a half-baked attempt at formatting the given value using the given Excel number_format. This is used by the tests to match values. Handled is many of the formats for numbers (int/float), datetime, date, time, and timedelta.

xls2xlsx.xls2xlsx module

class xls2xlsx.xls2xlsx.XLS2XLSX(f, dirname='.', ignore_workbook_corruption=False)[source]

Bases: object

Convert an xls file into an xlsx file. Everything is supported except for the things not supported by xlrd, which include:

  1. Conditional Formatting
  2. Formulas
  3. Charts and Images
  4. Pivot Tables

If this xls file is in html format, then we call HTMLXLS2XLSX to convert it.

static read(f, retries=6)[source]

Read from either a URL or a filename or file-like object.

to_xlsx(filename=None)[source]

Convert to xlsx using openpyxl. If filename is not None, then the result is written to that file, and the filename is returned, else the workbook is returned.

xls_color_to_xlsx(color_ndx)[source]
xls_date_to_xlsx(value)[source]
xls_height_to_xlsx(height)[source]
xls_style_to_xlsx(xf_ndx)[source]

Convert an xls xf_ndx into a 6-tuple of styles for xlsx

xls_width_to_xlsx(width)[source]

Module contents

Top-level package for xls2xlsx.