Skip to content

Conversation

@djsmith17
Copy link
Contributor

@djsmith17 djsmith17 commented Oct 22, 2025

This pull request includes a range of improvements and refactoring across several modules, focusing on code clarity, maintainability, and efficiency. The main changes involve cleaning up and optimizing phenotyping logic, enhancing logging and diagnostics in the main web app, improving subprocess handling for package installation, and updating documentation and comments for better readability.

Phenotyping logic improvements:

  • Refactored the init_pheno_cols, init_pheno_assign, and init_pheno_summ functions in basic_phenotyper_lib.py for more efficient and vectorized operations, replacing manual loops and deprecated code with pandas-native methods. This includes a new, faster implementation for generating species_name_short and calculating counts and percentages. [1] [2] [3]

Logging and diagnostics enhancements:

  • Added and configured a logging system in Multiplex_Analysis_Web_Apps.py to provide informative runtime messages for key operations such as platform initialization, directory creation, session state initialization, and benchmarking actions. [1] [2] [3] [4] [5]

Subprocess and package installation robustness:

  • Updated all subprocess calls in install_missing_packages.py to explicitly set check=False for better error handling and clarity, and improved code formatting for readability. [1] [2] [3] [4]

Documentation and code clarity:

  • Improved docstrings and comments in nidap_io.py for functions interacting with NIDAP, clarifying argument types, return values, and usage examples. [1] [2] [3]
  • Minor typo and comment fixes in nidap_dashboard_lib.py to improve code readability. [1] [2]

Module import and structure updates:

  • Changed imports in Multiplex_Analysis_Web_Apps.py from pages2 to pages, removed the live package installation call, and updated sidebar UI elements for clarity. [1] [2]

These changes collectively improve performance, maintainability, and user/developer experience across the codebase.This pull request refactors the codebase to improve maintainability, logging, and usability. The main changes include moving modules from the pages2 directory to pages, enhancing logging and user feedback in the main app, and improving subprocess handling in package installation scripts. Several modules also received docstrings and code cleanup for clarity.

Module refactoring and imports:

  • All imports previously referencing pages2 are updated to use the new pages directory, and associated files are renamed accordingly. This standardizes module organization and simplifies future maintenance. [1] [2] [3] [4]

Logging and user feedback improvements:

  • Introduced Python's logging module in Multiplex_Analysis_Web_Apps.py for better tracking of application events, such as directory creation, session state initialization, and sidebar actions. Logging statements now provide real-time feedback for key operations. [1] [2] [3]
  • Minor UI improvement: updated sidebar documentation link to use a book emoji for better readability.

Subprocess and installation script enhancements:

  • Updated all subprocess.run calls in install_missing_packages.py to explicitly set check=False for more robust error handling and readability. [1] [2] [3] [4]

Code documentation and cleanup:

  • Added module-level docstrings and function docstrings to several files (e.g., datafile_format_unifier.py), clarifying their purpose and usage for future developers. [1] [2] [3]
  • Minor code cleanup, such as removing unnecessary blank lines and reordering imports for clarity. [1] [2]

Main function usability:

  • Added a docstring to the main() function in Multiplex_Analysis_Web_Apps.py to clarify its role as the entry point for the application.

@djsmith17 djsmith17 self-assigned this Oct 22, 2025
For legibility and for CI/CD
Previously a bunch of UI elements were removed from the Data Import and Export page that were not necessary for local use/development. I am adding them back in for a few reasons.

- It makes development of these widgets easier when developing locally
- It will force us to consider how this design will need to change when we move from the NIDAP foundry platform to the next version.

I am also populating the list of files in the ./input folder to better test a nother festure. We can undo line 281 later if its confusing
Save this for later. I need to stop working on this for now, but I will be back
The column starting width can sometimes be way too large. This is an attempt to regulate that size so that its easier to read the tables in the Data Import and Export Page
This is mostly the same just with some added f strings and indentation. Following co-pilot and this indentation looked good.
Much more straight forward. Wish I had thought of it
How far we have come with making MAWA a more effecient tool
Made this more efficient and I cut this down by 40s for a 21M cell detaframe
Replaces the custom species_name_long_to_short function with a pandas str.extractall and groupby approach to generate the species_name_short column. The new method simplifies the code and leverages pandas string operations for improved clarity and maintainability.
@djsmith17 djsmith17 added bug Something isn't working dependencies Pull requests that update a dependency file labels Oct 23, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This pull request refactors and modernizes the codebase to improve maintainability, performance, and developer experience. The primary change is reorganizing modules from pages2 to pages, with additional enhancements to logging infrastructure, code documentation, subprocess handling, and phenotyping performance.

Key Changes:

  • Module reorganization from pages2 to pages directory across all imports
  • Introduction of structured logging for application events and diagnostics
  • Performance optimizations in phenotyping logic using vectorized pandas operations
  • Enhanced documentation with comprehensive docstrings and parameter descriptions

Reviewed Changes

Copilot reviewed 15 out of 40 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
Multiplex_Analysis_Web_Apps.py Updated imports from pages2 to pages, added logging infrastructure, improved main function documentation
streamlit_session_state_management.py Updated import path from pages2 to pages
streamlit_dataframe_editor.py Enhanced docstrings with detailed parameter descriptions, improved code formatting
platform_io.py Added comprehensive docstrings, improved code organization, added column configuration for dataframes
pages2/skeleton.py File removed (relocated to pages directory)
pages/skeleton.py Cleaned up comments for clarity
pages/thresholded_phenotyping.py Changed expander to form for marker selection, added benchmarking timers
pages/spatial_umap_prediction_app.py Updated import from pages2 to pages
pages/datafile_format_unifier.py Added module and function docstrings
pages/child_process_killer.py Added module docstring, reorganized imports
pages/adaptive_phenotyping.py Updated import from pages2 to pages, removed extra blank line
nidap_io.py Improved docstrings with detailed parameter and return value documentation
nidap_dashboard_lib.py Fixed typo in comment ("BUttons" to "Buttons"), removed unused import
install_missing_packages.py Added explicit check=False to all subprocess calls, improved code formatting
basic_phenotyper_lib.py Refactored phenotyping functions for performance using vectorized pandas operations

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

# # return 'Other'
# # return ' + '.join(marker_names[i] for i in marker_indices) + '+'
# df['species_name_short'] = df['species_name_long'].apply(species_name_long_to_short)
df['species_name_short'] = df['species_name_long'].str.extractall(r'(\w+)\+').groupby(level=0).agg(' + '.join).fillna('Other') + '+'
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new vectorized implementation for 'species_name_short' changes the output format compared to the commented-out function. The original function returns 'Other' for cases with no positive markers, but the new implementation returns 'Other+'. Additionally, cells with positive markers will get an extra '+' at the end. For example, 'ECAD+ COX2+' in the old version will become 'ECAD + COX2++' in the new version. Consider using: df['species_name_short'] = df['species_name_long'].str.extractall(r'(\w+)\+').groupby(level=0).agg(' + '.join).fillna('Other').apply(lambda x: x if x == 'Other' else x + '+')

Suggested change
df['species_name_short'] = df['species_name_long'].str.extractall(r'(\w+)\+').groupby(level=0).agg(' + '.join).fillna('Other') + '+'
df['species_name_short'] = df['species_name_long'].str.extractall(r'(\w+)\+').groupby(level=0).agg(' + '.join).fillna('Other').apply(lambda x: x if x == 'Other' else x + '+')

Copilot uses AI. Check for mistakes.
Comment on lines +175 to +182
with st.form('Choose Markers to include'):
st.multiselect('Markers', options = st.session_state.loaded_marker_names,
key = 'marker_multi_sel',
on_change=marker_multiselect_callback)
key = 'marker_multi_sel')

# Every form must have a submit button.
submitted = st.form_submit_button('Apply Changes')
if submitted:
marker_multiselect_callback()
Copy link

Copilot AI Oct 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The form ID 'Choose Markers to include' contains spaces which could cause issues with Streamlit's internal form handling. Form identifiers should follow a consistent naming convention without spaces. Consider using a key like 'marker_selection_form' instead.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working dependencies Pull requests that update a dependency file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants