imzml_writer

imzml_writer.imzML_Writer

imzml_writer.imzML_Writer.gui(tgt_dir=None)[source]

A typical launch looks like:

import imzml_writer.imzML_Writer as iw

##Launch with no target directory - navigate with UI
iw.gui()

##Launch with a target directory - the GUI opens in that directory
path = "/path/to/your/imzML/files"
iw.gui(path)

For detailed instructions on navigating the GUI, see the user guide.

imzml_writer.imzML_Scout

imzml_writer.imzML_Scout.main(tgt_file='', initial_mz=104.1070)[source]

Main control loop for imzML Scout GUI. Callable either with no arguments (find file via GUI) or by passing the file path to the target imzML.

Parameters:

tgt_file (str) – Path to imzML file for visualization.

Typically called during normal operation of imzML_Writer, but can also be called directly:

import imzml_writer.imzML_Scout as scout

##Calling with no arguments opens an empty window - use the GUI to find your file
scout.main()

##Calling with a full or relative path to the imzML opens the specified file
path_to_imzML = "/Example/File/path/my_image.imzML"
scout.main(path_to_imzML)

For detailed instructions on navigating the GUI, see the user guide.

imzml_writer.ms_convert_gui

imzml_writer.ms_convert_gui.main(tgt_dir=None)[source]

Experimental - Provides a Mac GUI for MSConvert as a wrapper around the msconvert Docker image.

Parameters:

tgt_dir (str) – (optional) Initial directory for the GUI to open in.

imzml_writer.utils

imzml_writer.utils.Check_Docker_Image()[source]

Tests that Docker is available and prompts the user to update or install it if it is not.

imzml_writer.utils.RAW_to_mzML(path, write_mode='Centroid', combine_ion_mobility=False)[source]

Calls msconvert via Docker on Linux and Mac, or calls the viaPWIZ method on Windows, to convert the raw vendor files within the specified path to mzML format.

Parameters:
  • path (str) – path to files containing raw instrument data.

  • write_mode (str) – Write mode for msconvert - ‘Profile’ or ‘Centroid’.

  • combine_ion_mobility (bool) – Whether or not to pass the --combineIonMobilitySpectra flag to msconvert (default False)

imzml_writer.utils.alphanum_key(s)[source]

Part of the human sortable collection of functions borrowed from http://nedbatchelder.com/blog/200712/human_sorting.html. Turn a string into a list of string and number chunks.

Parameters:

s (str) – String to be chunked out

Return type:

list

Returns:

List of string/number chunks

imzml_writer.utils.annotate_from_model_imzML(model, to_annotate)[source]

Annotates an imzML file based on an example file - intended for use in conjunction with write_masked_imzML to preserve metadata.

imzml_writer.utils.annotate_imzML(annotate_file, SRC_mzML, scan_time=0.001, filter_string='none given', x_speed=1, y_step=1, polarity='positive', ms_level=1, scan_mode='x-scan')[source]

Takes pyimzml output imzML files and annotates them using GUI inputs and the corresponding mzML source file, then cleans up errors in the imzML structure for compatibility with imzML viewers/processors.

Parameters:
  • annotate_file (str) – the imzML file to be annotated

  • SRC_mzML (str) – the source file to pull metadata from

  • scan_time (float) – The total time required to scan across the imaging area at speed x_speed (mins)

  • filter_string (str) – what scan filter is actually captured (default = “none given”)

  • x_speed (float) – The scan speed across the imaging area during linescans (µm/s)

  • y_step (float) – The distance between adjacent strip lines across the imaging area (µm)

  • scan_mode (str) – Whether the data was acquired in ‘x-scan’ or ‘y-scan’ mode.
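The relationship between scan_time and x_speed can be illustrated with a short worked example. This is only a sketch of the arithmetic, assuming scan_time refers to a single line scan and that pixels are evenly spaced along the line; the function names are hypothetical and are not part of imzml_writer.

```python
def line_length_um(scan_time_min: float, x_speed_um_per_s: float) -> float:
    """Distance covered by one line scan: speed (µm/s) times duration (s)."""
    return x_speed_um_per_s * scan_time_min * 60.0


def pixel_width_um(scan_time_min: float, x_speed_um_per_s: float, n_pixels: int) -> float:
    """Width of one pixel, assuming n_pixels evenly spaced spectra per line."""
    return line_length_um(scan_time_min, x_speed_um_per_s) / n_pixels


# Example: a 0.5 min line scan at 40 µm/s, split into 100 pixels
length = line_length_um(0.5, 40.0)
width = pixel_width_um(0.5, 40.0, 100)
print(f"line length: {length} µm, pixel width: {width} µm")
```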

imzml_writer.utils.autofind_msconvert()[source]

Finds msconvert by searching all available drives, verifies success by calling info of msconvert

Returns:

Full path to msconvert.exe

imzml_writer.utils.check_msconvert()[source]

Checks that msconvert is available for the current python environment - returns msconvert path/callable

imzml_writer.utils.clean_raw_files(path, file_type)[source]

Cleans up file system after RAW_to_mzML has completed, creating two folders within the specified path:

Initial RAW files - raw vendor files

Output mzML Files - processed mzML files output by msConvert

Parameters:
  • path (str) – path to directory to clean up

  • file_type (str) – extension for raw vendor data to place into raw file directory
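The folder layout described above can be sketched as follows. This is an illustrative re-implementation under stated assumptions, not the package's own code: the function name sort_converted_files is hypothetical, files are matched by extension only, and file_type is accepted with or without a leading dot.

```python
import os
import shutil


def sort_converted_files(path: str, file_type: str) -> None:
    """Move raw vendor files and mzML files in `path` into two subfolders."""
    raw_dir = os.path.join(path, "Initial RAW files")
    mzml_dir = os.path.join(path, "Output mzML Files")
    os.makedirs(raw_dir, exist_ok=True)
    os.makedirs(mzml_dir, exist_ok=True)

    # Normalise the extension so "raw" and ".raw" behave the same
    raw_ext = "." + file_type.lstrip(".").lower()

    for name in os.listdir(path):
        src = os.path.join(path, name)
        if src in (raw_dir, mzml_dir):
            continue  # never move the destination folders themselves
        lower = name.lower()
        if lower.endswith(".mzml"):
            shutil.move(src, os.path.join(mzml_dir, name))
        elif lower.endswith(raw_ext):
            shutil.move(src, os.path.join(raw_dir, name))
```

Files with any other extension are left where they are.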

imzml_writer.utils.find_file(target, folder)[source]

Recursively searches the folder for the target file - helps find msconvert in cases where it isn’t specified in the path.

Parameters:
  • target (str) – Target file as a string

  • folder (str) – Top-level folder to search through

Returns:

full path to file if found, [ ] if not present
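A recursive search of this kind can be sketched with os.walk. The name find_file_sketch is hypothetical and this is not the package's source; it only mirrors the documented behaviour of returning the full path when found and an empty list otherwise.

```python
import os


def find_file_sketch(target: str, folder: str):
    """Walk `folder` top-down and return the first full path matching `target`."""
    for root, _dirs, files in os.walk(folder):
        if target in files:
            return os.path.join(root, target)
    return []  # mirrors the documented "not present" return value
```

Because an empty list is falsy, callers can test the result with a plain `if result:` check.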

imzml_writer.utils.get_drives()[source]

On Windows machines, retrieves the accessible drives (e.g. C:, D:, etc.) to allow automated searching for msconvert.

Returns:

Available drives, as a list of strings.
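One common way to enumerate drives on Windows is to probe each drive letter, as sketched below. This is an assumption about how such a lookup could be done, not necessarily the approach taken by get_drives; the name list_drives is hypothetical.

```python
import os
import string


def list_drives() -> list:
    """Return drive names such as 'C:' for every drive letter whose root exists."""
    return [
        f"{letter}:"
        for letter in string.ascii_uppercase
        if os.path.exists(f"{letter}:\\")
    ]


print(list_drives())
```

On non-Windows platforms this will normally print an empty list.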

imzml_writer.utils.get_file_type(path)[source]

Identifies the most abundant file type in the specified path, ignoring hidden files.

Parameters:

path (str) – path to files specified as a string.

Returns:

Most abundant file extension in path
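Counting extensions while skipping hidden files can be sketched with collections.Counter. The function name most_common_extension is hypothetical, and the choices to return the extension without its leading dot and to return an empty string for an empty folder are assumptions, not documented behaviour of get_file_type.

```python
import os
from collections import Counter


def most_common_extension(path: str) -> str:
    """Return the most frequent file extension in `path`, ignoring hidden files."""
    counts = Counter()
    for name in os.listdir(path):
        if name.startswith("."):
            continue  # hidden files such as .DS_Store are ignored
        ext = os.path.splitext(name)[1].lstrip(".")
        if ext:
            counts[ext] += 1
    return counts.most_common(1)[0][0] if counts else ""
```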

imzml_writer.utils.get_final_scan_time(run)[source]

Returns the final scan time from the specified mzML

Parameters:

run (Reader) – pymzml reader object

Return scan_time:

Scan time in minutes (float)

imzml_writer.utils.human_sort(l)[source]

Part of the human sortable collection of functions borrowed from http://nedbatchelder.com/blog/200712/human_sorting.html. Sorts a list in the way that humans expect.

Parameters:

l (list) – List to be sorted in a human-intuitive way.

Return type:

list

Returns:

Sorted list.
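The three human-sorting helpers (tryint, alphanum_key and human_sort) work together. The sketch below follows the recipe from the blog post referenced above rather than the package source, and it returns a new sorted list, which is an assumption based on the documented return type.

```python
import re


def tryint(s):
    """Return s as an int if possible, otherwise return s unchanged."""
    try:
        return int(s)
    except ValueError:
        return s


def alphanum_key(s):
    """Split a string into text and number chunks: 'line10' -> ['line', 10, '']."""
    return [tryint(chunk) for chunk in re.split("([0-9]+)", s)]


def human_sort(l):
    """Sort so that embedded numbers are compared numerically."""
    return sorted(l, key=alphanum_key)


files = ["line10.mzML", "line2.mzML", "line1.mzML"]
print(sorted(files))      # ['line1.mzML', 'line10.mzML', 'line2.mzML']
print(human_sort(files))  # ['line1.mzML', 'line2.mzML', 'line10.mzML']
```

This matters for line-by-line conversion, where files named by line number need to be processed in acquisition order rather than plain alphabetical order.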

imzml_writer.utils.imzML_metadata_process(model_files, x_speed, y_step, path, tgt_progress=None, scan_mode='x-scan')[source]

Manages annotation of imzML files with metadata from source mzML files and user-specified fields (GUI).

Parameters:
  • model_files (str) – Path to the folder containing mzML files

  • x_speed (float) – scan speed in the x-direction, µm/sec

  • y_step (float) – step between strip lines, µm

  • path (str) – path to the directory where imzML files should be stored after annotation

  • tgt_progress – Tkinter progress bar object to update as the process continues

  • scan_mode (str) – Whether the data was acquired in ‘x-scan’ or ‘y-scan’ mode.

imzml_writer.utils.move_files(probe_txt, path)[source]

Moves files matching a search string (probe_txt) in the current working directory into the specified directory in a new folder called ‘probe_txt’

Parameters:
  • probe_txt (str) – The search string to find in the current directory.

  • path (str) – The target directory to move files to

imzml_writer.utils.msconvert_searchUI()[source]

Launches a dialog window to ask the user whether to search manually or automatically for msconvert install path

Return type:

str

Returns:

Specified mode to search for msconvert (“auto” or “manual”)

imzml_writer.utils.mzML_to_imzML_convert(progress_target=None, PATH=os.getcwd(), LOCK_MASS=0, TOLERANCE=20, zero_indexed=False, no_duplicating=False, scan_mode='x-scan')[source]

Handles conversion of mzML files to the imzML format using the pyimzml library. Converts data line-by-line (one mzML at a time), aligning data based on scan time and splitting into separate imzML files for each scan in the source mzML.

Parameters:
  • progress_target – tkinter progress bar object from the GUI to update as conversion progresses

  • PATH (str) – Working path for source mzML files

  • LOCK_MASS (float) – m/z to use for coarse m/z recalibration if desired. 0 = No recalibration

  • TOLERANCE (float) – Search tolerance (in ppm) with which to correct m/z based on the specified lock mass. Default 20 ppm

  • zero_indexed (bool) – Specifies whether pixel dimensions should start from 1 (default - False) or 0 (True)

  • no_duplicating (bool) – If True, prevents spectra from being duplicated into adjacent pixels for sparsely sampled lines. Default False (duplication allowed)

  • scan_mode (str) – Whether the data was acquired in ‘x-scan’ or ‘y-scan’ mode.

imzml_writer.utils.tryint(s)[source]

Part of the human sorting collection of functions borrowed from http://nedbatchelder.com/blog/200712/human_sorting.html. Returns an int if possible, or s unchanged.

Parameters:

s – Trial variable to test if it can be converted to an integer

Returns:

integer if convertible, s if not.

imzml_writer.utils.viaPWIZ(path, write_mode, combine_ion_mobility)[source]

Method to call msconvert directly if the detected platform is Windows. Converts all target files in the path to mzML in the specified mode.

Parameters:
  • path (str) – path to the target files

  • write_mode (str) – “Centroid” or “Profile” modes

  • combine_ion_mobility (bool) – Whether or not the --combineIonMobilitySpectra flag is passed to msconvert

Returns:

None

imzml_writer.utils.write_masked_imzML(source_file, roi_mask, save_dir=None)[source]

Rewrites the specified imzML file as masked by the ROI, useful to truncate out only the tissue for reduced file sizes

Parameters:
  • source_file (str) – path to the source imzML

  • roi_mask (array) – numpy array matching dimensions of source file, where 0 indicates an excluded pixel and 1 is an included pixel

  • save_dir (str) – Where to save the resulting imzML, defaults to the same directory as the source file

Return type:

str

imzml_writer.recalibrate_mz

imzml_writer.recalibrate_mz.recalibrate(mz, int, lock_mz, search_tol, ppm_off=0)[source]

Performs a coarse m/z recalibration based on shifting a lock mass back to target, and everything else by the same ppm shift. Applies correction based on the highest m/z peak within the search tolerance.

Parameters:
  • mz (list) – List of mz values in the spectrum to recalibrate

  • int (list) – Corresponding list of intensities

  • lock_mz (float) – Target lock mass to calibrate to - should be in the majority/all spectra

  • search_tol (float) – Tolerance with which to search for the lock mass (ppm)

  • ppm_off (float) – Optional argument specifying the previous/typical ppm error, applied if the lock mass cannot be found (default = 0, no correction)

Return recalibrated_mz:

Recalibrated mz if applicable, based on either the ppm error to the lock mass or optional ppm_off argument

Return ppm_off:

Applied correction in ppm
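The lock-mass correction can be illustrated with a minimal sketch. Several details here are assumptions rather than the package's implementation: the name recalibrate_sketch is hypothetical, the most intense peak inside the search window is taken as the lock mass, and the correction divides every m/z by (1 + ppm error × 10⁻⁶) so that the observed lock mass lands exactly on its target.

```python
def recalibrate_sketch(mz, intensities, lock_mz, search_tol, ppm_off=0.0):
    """Shift all m/z values by the ppm error measured at the lock mass."""
    # Half-width of the search window in m/z units
    window = lock_mz * search_tol * 1e-6

    # Peaks inside the window, stored as (intensity, m/z) pairs
    candidates = [
        (inten, m) for m, inten in zip(mz, intensities)
        if abs(m - lock_mz) <= window
    ]

    if candidates:
        # Assumption: use the most intense candidate as the lock mass peak
        _, observed = max(candidates)
        ppm_off = (observed - lock_mz) / lock_mz * 1e6
    # If no peak was found, the caller-supplied ppm_off is applied instead

    scale = 1.0 + ppm_off * 1e-6
    return [m / scale for m in mz], ppm_off


new_mz, ppm = recalibrate_sketch(
    [100.0, 104.1080, 200.0], [10.0, 50.0, 5.0], lock_mz=104.1070, search_tol=20.0
)
print(new_mz, ppm)
```

Returning ppm_off lets the caller feed the last successful correction into the next spectrum, which matches the documented purpose of the optional argument.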

imzml_writer.analyte_list_cleanup

imzml_writer.analyte_list_cleanup.check_column_order(input_data)[source]

Takes a pandas dataframe of mz and names and checks that they’re in the order assumed by imzML_Scout (name, mz). If not, reorganizes columns to match expected order.

Parameters:

input_data (DataFrame) – Pandas dataframe containing mz and names of targets

Returns:

same dataframe with columns ordered [name, mz]

imzml_writer.analyte_list_cleanup.check_headers(input_data, path)[source]

Takes a pandas dataframe of mz and name and checks if the headers are missing - taken as a header being convertible to an integer (i.e. an mz value in the header). If headers are missing, it rereads the sheet specified at [path] with no headers and manually inserts them.

Parameters:
  • input_data (DataFrame) – Pandas dataframe with columns of mz and names

  • path (str) – Absolute or relative path specified as a string

Returns:

pandas dataframe of mz and names with header inserted, if needed

imzml_writer.analyte_list_cleanup.cleanup_table(input_data, path)[source]

Takes a pandas dataframe of columns [mz, name] or vice versa and sanitizes it for imzML Scout by:

  1. Making sure the expected headers are present - check_headers()

  2. Making sure the columns are in the expected order (name then mz) - check_column_order()

  3. Cleaning up any incompatible characters in the name that would prevent file saving - name_cleanup()

This allows users to specify ‘messy’ excel sheets for bulk export without imzML Scout failing.

Parameters:
  • input_data (DataFrame) – Pandas dataframe of input mz and name

  • path (str) – Path to the corresponding excel sheet, used if it needs to be reread without headers

Returns:

Sanitized pandas dataframe of mz and names compatible with image/csv export of imzML Scout.

imzml_writer.analyte_list_cleanup.name_cleanup(input_data)[source]

Takes a pandas dataframe of form name, mz and reads the first column (names) replacing ‘dangerous’ characters with ‘_’ to ensure safe storage.

Parameters:

input_data (DataFrame) – Pandas dataframe of form [name, mz]

Returns:

Pandas dataframe with trouble characters replaced by ‘_’.
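The character replacement step can be sketched on plain strings. Which characters count as ‘dangerous’ is not stated above, so the set used here (characters that are commonly invalid in file names) is an assumption, and clean_name is a hypothetical helper rather than part of the package.

```python
import re

# Assumed set of unsafe characters: \ / : * ? " < > |
UNSAFE_CHARS = re.compile(r'[\\/:*?"<>|]')


def clean_name(name) -> str:
    """Replace every unsafe character in an analyte name with an underscore."""
    return UNSAFE_CHARS.sub("_", str(name))


names = ["PC(34:1)", "Glu/Gln", "choline"]
cleaned = [clean_name(n) for n in names]
print(cleaned)  # ['PC(34_1)', 'Glu_Gln', 'choline']
```

With a dataframe, the same function could be applied to the name column before exporting images or csv files.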