Functions and parameters in the Slim-TPCA package

List of functions:

The list of functions can be found using:

Each function is explained in the following text.

preproc(table, ref_col=1):

Calculates soluble fraction at each temperature.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

ref_col : Reference column index for conversion to soluble fraction

Return:

A DataFrame table containing the soluble fractions of all proteins at different temperatures, relative to the reference column
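As a rough illustration of this conversion, the sketch below divides every reading of a protein by its reading in the reference column. It uses plain Python lists instead of a DataFrame to stay dependency-free. The helper name to_soluble_fraction, the row layout (protein ID first, then one value per temperature) and the 0-based reading of ref_col are assumptions made for illustration, not the package's actual implementation.

```python
def to_soluble_fraction(rows, ref_col=1):
    """Divide each reading by the reading in the reference column.

    rows: list of [protein_id, value_T1, value_T2, ...] (assumed layout).
    ref_col: index of the reference column within each row (assumed 0-based).
    """
    result = []
    for row in rows:
        ref = row[ref_col]
        if ref == 0:
            # A zero reference gives no meaningful fraction; skip the protein.
            continue
        result.append([row[0]] + [value / ref for value in row[1:]])
    return result


# Hypothetical intensities for two proteins at three temperatures.
intensities = [
    ["P1", 200.0, 150.0, 50.0],
    ["P2", 80.0, 80.0, 20.0],
]
fractions = to_soluble_fraction(intensities)
print(fractions)
```

With the lowest temperature as reference, every curve starts at 1.0 and falls as the protein denatures.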

dist(table, ref_col=1, method='cityblock'):

Calculates distance between every two proteins.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

Return:

A two-dimensional matrix table where the entry in row i, column j is the distance between protein i and protein j
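The default method name, cityblock, is the Manhattan (L1) distance: the sum of absolute differences between two melting curves. The sketch below builds a small pairwise distance table in plain Python. The names cityblock and distance_matrix and the example curves are hypothetical and only illustrate the idea; they are not taken from the package.

```python
from itertools import product


def cityblock(u, v):
    """Manhattan distance: sum of absolute differences between two curves."""
    return sum(abs(a - b) for a, b in zip(u, v))


def distance_matrix(curves):
    """Return {(protein_i, protein_j): distance} for every ordered pair."""
    ids = list(curves)
    return {(i, j): cityblock(curves[i], curves[j]) for i, j in product(ids, repeat=2)}


# Hypothetical soluble-fraction curves at three temperatures.
curves = {
    "P1": [1.0, 0.75, 0.25],
    "P2": [1.0, 1.0, 0.25],
    "P3": [1.0, 0.5, 0.0],
}
matrix = distance_matrix(curves)
print(matrix[("P1", "P2")], matrix[("P1", "P3")], matrix[("P2", "P3")])
```

The table is symmetric with zeros on the diagonal, so only the upper triangle carries information.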

pair_found(table, pair_table, ref_col=1):

Using a protein-protein interaction database, finds the protein pairs in which both proteins appear in the data.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

pair_table : A table with the first protein and the second protein of each protein pair in the first two columns

ref_col : Reference column index for conversion to soluble fraction

Return:

A DataFrame table containing protein pairs where both proteins appear in the data
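The filtering step can be pictured as a set-membership test on both members of each pair. The following is a minimal sketch with plain lists and tuples; the function name pairs_in_data and the example IDs are hypothetical.

```python
def pairs_in_data(protein_ids, pairs):
    """Keep only the pairs whose two proteins were both measured."""
    present = set(protein_ids)
    return [(a, b) for a, b in pairs if a in present and b in present]


protein_ids = ["P1", "P2", "P3"]
pairs = [("P1", "P2"), ("P2", "P9"), ("P3", "P1")]
found = pairs_in_data(protein_ids, pairs)
print(found)
```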

roc(table, pair_table, ref_col=1, method='cityblock'):

Calculates the parameters of the ROC curve.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

pair_table : A table with the first protein and the second protein of each protein pair in the first two columns

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

Return:

Three parameters of the ROC curve
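The three returned values are not named here. A common convention for ROC curves is a triple of false positive rates, true positive rates and thresholds; whether roc follows that convention is an assumption. The sketch below shows how such a curve can be computed when a small distance is taken as evidence of interaction. The function names and the example data are hypothetical.

```python
def roc_points(distances, labels):
    """Compute ROC points for the rule "predict interaction if distance <= t".

    labels: True for known interacting pairs, False otherwise.
    Assumes at least one positive and one negative label.
    """
    ranked = sorted(zip(distances, labels))
    positives = sum(1 for _, is_pair in ranked if is_pair)
    negatives = len(ranked) - positives
    fpr, tpr, thresholds = [0.0], [0.0], [float("-inf")]
    tp = fp = 0
    for k, (d, is_pair) in enumerate(ranked):
        if is_pair:
            tp += 1
        else:
            fp += 1
        # Emit one point per distinct distance so ties share a threshold.
        if k + 1 == len(ranked) or ranked[k + 1][0] != d:
            fpr.append(fp / negatives)
            tpr.append(tp / positives)
            thresholds.append(d)
    return fpr, tpr, thresholds


def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2 for k in range(1, len(fpr)))


distances = [0.1, 0.4, 0.2, 0.3]
labels = [True, False, False, True]
fpr, tpr, thresholds = roc_points(distances, labels)
print(fpr, tpr, auc(fpr, tpr))
```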

roc_plot(table, pair_table, ref_col=1, method='cityblock'):

Draws the ROC plot based on the parameters.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

pair_table : A table with the first protein and the second protein of each protein pair in the first two columns

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

Return:

None

complex_found(table, complex_table, ref_col=1):

Finds complexes that meet the requirements of the analysis.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

complex_table : A DataFrame table with complex-related information, where the subunits (UniProt IDs) column contains the protein IDs in the complex (separated by ;)

ref_col : Reference column index for conversion to soluble fraction

Return:

A table containing the complexes that meet the analysis requirements, with the discovered and undiscovered proteins in columns Subunit_Found and No_Subunit_Found, respectively
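To illustrate how a semicolon-separated subunit field can be split into found and not-found proteins, here is a minimal sketch. The helper name split_subunits is hypothetical, and the actual requirements a complex must meet (for example a minimum number of found subunits) are not specified here and therefore not shown.

```python
def split_subunits(subunit_field, protein_ids):
    """Split a ';'-separated subunit string into measured and unmeasured IDs."""
    present = set(protein_ids)
    subunits = [s.strip() for s in subunit_field.split(";") if s.strip()]
    found = [s for s in subunits if s in present]
    missing = [s for s in subunits if s not in present]
    return found, missing


found, missing = split_subunits("P1;P2;P9", ["P1", "P2", "P3"])
print(found, missing)
```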

complex_dist(table, complex_table, ref_col=1, method='cityblock'):

Calculates the average distance between the subunit proteins of each complex.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

complex_table : A DataFrame table with complex-related information, where the subunits (UniProt IDs) column contains the protein IDs in the complex (separated by ;)

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

Return:

A table including the average distance and z-score values for each protein complex
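The average distance of a complex can be read as the mean distance over all unordered pairs of its subunits. The sketch below shows that calculation with the Manhattan distance on hypothetical curves. The function names are illustrative, and the z-score step is omitted because its derivation is not described here.

```python
from itertools import combinations


def cityblock(u, v):
    """Manhattan distance between two melting curves."""
    return sum(abs(a - b) for a, b in zip(u, v))


def average_pairwise_distance(curves, subunits):
    """Mean distance over all unordered subunit pairs (needs >= 2 subunits)."""
    pairs = list(combinations(subunits, 2))
    return sum(cityblock(curves[a], curves[b]) for a, b in pairs) / len(pairs)


curves = {
    "P1": [1.0, 0.75, 0.25],
    "P2": [1.0, 1.0, 0.25],
    "P3": [1.0, 0.5, 0.0],
}
avg_dist = average_pairwise_distance(curves, ["P1", "P2", "P3"])
print(avg_dist)
```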

random_n(table, complex_table, ref_col=1, method='cityblock', samplesize=10000):

Samples virtual random complexes for calculation.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

complex_table : A DataFrame table with complex-related information, where the subunits (UniProt IDs) column contains the protein IDs in the complex (separated by ;)

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

samplesize : Number of random samples

Return:

A list containing the sampled virtual random complexes, which have the same sizes as the real complexes
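A virtual random complex can be thought of as a set of proteins drawn without replacement from all measured proteins, with as many members as the real complex. The sketch below draws such sets with the standard library. The function name, the seed argument and the protein IDs are hypothetical.

```python
import random


def sample_random_complexes(protein_ids, size, samplesize, seed=0):
    """Draw `samplesize` random complexes of `size` distinct proteins each."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [rng.sample(protein_ids, size) for _ in range(samplesize)]


protein_ids = [f"P{k}" for k in range(1, 21)]
random_complexes = sample_random_complexes(protein_ids, size=3, samplesize=5)
print(random_complexes)
```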

complex_signature_sample(table, complex_table, ref_col=1, method='cityblock', samplesize=10000):

Calculates TPCA signatures of complexes by sampling.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

complex_table : A DataFrame table with complex-related information, where the subunits (UniProt IDs) column contains the protein IDs in the complex (separated by ;)

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

samplesize : Number of random samples

Return:

Two tables: the first contains the random values used to calculate the p-values; the second contains the p-value and z-score for each protein complex
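One common way to turn sampled random values into a p-value is the empirical tail fraction: the share of random complexes whose average distance is at least as small as the observed one. The sketch below shows that calculation on made-up numbers. It is an assumption about the general approach, not a description of the package's exact formula.

```python
def empirical_p_value(observed, random_values):
    """Fraction of random average distances that are <= the observed one."""
    at_least_as_small = sum(1 for v in random_values if v <= observed)
    return at_least_as_small / len(random_values)


# Hypothetical average distances of ten random complexes.
random_avg = [0.9, 1.1, 0.4, 1.3, 0.8, 1.0, 1.2, 0.7, 1.4, 0.6]
p_value = empirical_p_value(0.5, random_avg)
print(p_value)
```

With this estimator the smallest reportable p-value is limited by the number of samples, which is one reason a sampling approach benefits from a large samplesize.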

complex_signature_beta(table, complex_table, ref_col=1, method='cityblock', samplesize=500):

Calculates TPCA signatures of complexes by fitting a beta distribution to random complexes.

Parameters:

table : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures

complex_table : A DataFrame table with complex-related information, where the subunits (UniProt IDs) column contains the protein IDs in the complex (separated by ;)

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

samplesize : Number of random samples

Return:

Two tables: the first contains the random values used to calculate the p-values; the second contains the p-value and z-score for each protein complex
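Fitting a distribution to the random values lets a p-value be read from the fitted curve rather than counted from samples, which is presumably why the default samplesize is smaller here. The sketch below is only one way to do this: it fits a beta distribution by the method of moments and integrates the density numerically. It assumes the random values have already been scaled into the open interval (0, 1). The fitting method, the scaling and all function names are assumptions, not the package's documented procedure.

```python
import math
import statistics


def fit_beta_moments(values):
    """Method-of-moments estimates of the beta shape parameters (a, b)."""
    m = statistics.mean(values)
    v = statistics.pvariance(values)
    common = m * (1 - m) / v - 1  # must be positive for a valid fit
    return m * common, (1 - m) * common


def beta_cdf(x, a, b, steps=20000):
    """Lower-tail probability P(X <= x), 0 < x <= 1, by the midpoint rule."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))
    return total * h


# Hypothetical scaled average distances of random complexes.
null_values = [0.2, 0.4, 0.6, 0.8]
a, b = fit_beta_moments(null_values)
p_value = beta_cdf(0.5, a, b)  # chance of a distance this small or smaller
print(a, b, p_value)
```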

align(table_1, table_2, ref_col=1):

Different datasets may identify different proteins; this function aligns them so that the tables can be compared.

Parameters:

table_1 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in one state

table_2 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in the other state

ref_col : Reference column index for conversion to soluble fraction

Return:

Two tables, one for table_1 and one for table_2, containing the protein IDs after alignment
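One plausible reading of alignment is to keep only the proteins measured in both datasets. The sketch below does this with dictionaries keyed by protein ID. The function name align_tables and the data layout are hypothetical, and the package may align the tables differently.

```python
def align_tables(curves_1, curves_2):
    """Keep only the proteins present in both datasets, in the order of curves_1."""
    shared = [pid for pid in curves_1 if pid in curves_2]
    aligned_1 = {pid: curves_1[pid] for pid in shared}
    aligned_2 = {pid: curves_2[pid] for pid in shared}
    return aligned_1, aligned_2


state_1 = {"P1": [1.0, 0.8], "P2": [1.0, 0.6], "P3": [1.0, 0.4]}
state_2 = {"P2": [1.0, 0.7], "P3": [1.0, 0.3], "P4": [1.0, 0.9]}
aligned_1, aligned_2 = align_tables(state_1, state_2)
print(list(aligned_1), list(aligned_2))
```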

dynamic_complex_absolute_sample(table_1, table_2, complex_table, ref_col=1, method='cityblock', samplesize=10000):

Calculates TPCA dynamic modulation signatures of complexes by sampling and absolute distance.

Parameters:

table_1 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in one state

table_2 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in the other state

complex_table : A DataFrame table with complex-related information, where the subunits (UniProt IDs) column contains the protein IDs in the complex (separated by ;)

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

samplesize : Number of random samples

Return:

Two tables: the first contains the random values used to calculate the p-values; the second contains, for each protein complex, the average Manhattan distance between the melting curves of all pairs of subunits in table_1 (col: Avg_Dist_1) and table_2 (col: Avg_Dist_2), the z-scores in table_1 (col: Avg_Dist_Derived_1) and table_2 (col: Avg_Dist_Derived_2), the relative change in Avg Dist, and the dynamic p-value of the change in the protein complex (col: Dynamic_P)

dynamic_complex_relative_sample(table_1, table_2, complex_table, ref_col=1, method='cityblock', samplesize=10000):

Calculates TPCA dynamic modulation signatures of complexes by sampling and relative distance.

Parameters:

table_1 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in one state

table_2 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in the other state

complex_table : A DataFrame table with complex-related information, where the subunits (UniProt IDs) column contains the protein IDs in the complex (separated by ;)

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

samplesize : Number of random samples

Return:

Two tables: the first contains the random values used to calculate the p-values; the second contains, for each protein complex, the average Manhattan distance between the melting curves of all pairs of subunits in table_1 (col: Avg_Dist_1) and table_2 (col: Avg_Dist_2), the z-scores in table_1 (col: Avg_Dist_Derived_1) and table_2 (col: Avg_Dist_Derived_2), the relative change in Avg Dist, and the dynamic p-value of the change in the protein complex (col: Dynamic_P)
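How the absolute and relative variants differ is not defined in this text. One plausible reading is that the absolute variant works with the plain difference of a complex's average distance between the two states, while the relative variant divides that difference by the value in the first state. The sketch below shows both quantities under that assumption. The function names and the direction of the difference (state 2 minus state 1) are hypothetical.

```python
def absolute_change(avg_dist_1, avg_dist_2):
    """Plain difference in average distance between the two states."""
    return avg_dist_2 - avg_dist_1


def relative_change(avg_dist_1, avg_dist_2):
    """Difference expressed as a fraction of the state-1 value (must be non-zero)."""
    return (avg_dist_2 - avg_dist_1) / avg_dist_1


# Hypothetical average distances of one complex in the two states.
avg_dist_1, avg_dist_2 = 0.5, 0.75
abs_change = absolute_change(avg_dist_1, avg_dist_2)
rel_change = relative_change(avg_dist_1, avg_dist_2)
print(abs_change, rel_change)
```

Under this reading, a positive change means the subunit melting curves lie further apart in the second state.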

dynamic_complex_absolute_beta(table_1, table_2, complex_table, ref_col=1, method='cityblock', samplesize=500):

Calculates TPCA dynamic modulation signatures of complexes by beta distribution fitting and absolute distance.

Parameters:

table_1 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in one state

table_2 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in the other state

complex_table : A DataFrame table with complex-related information, where the subunits (UniProt IDs) column contains the protein IDs in the complex (separated by ;)

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

samplesize : Number of random samples

Return:

A DataFrame table containing, for each protein complex, the average Manhattan distance between the melting curves of all pairs of subunits in table_1 (col: Avg_Dist_1) and table_2 (col: Avg_Dist_2), the z-scores in table_1 (col: Avg_Dist_Derived_1) and table_2 (col: Avg_Dist_Derived_2), the relative change in Avg Dist, and the dynamic p-value of the change in the protein complex (col: Dynamic_P)

dynamic_complex_relative_beta(table_1, table_2, complex_table, ref_col=1, method='cityblock', samplesize=500):

Calculates TPCA dynamic modulation signatures of complexes by beta distribution fitting and relative distance.

Parameters:

table_1 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in one state

table_2 : A DataFrame table containing the soluble concentrations (or intensities) of all proteins at different temperatures in the other state

complex_table : A DataFrame table with complex-related information, where the subunits (UniProt IDs) column contains the protein IDs in the complex (separated by ;)

ref_col : Reference column index for conversion to soluble fraction

method : Distance calculation method

samplesize : Number of random samples

Return:

A DataFrame table containing, for each protein complex, the average Manhattan distance between the melting curves of all pairs of subunits in table_1 (col: Avg_Dist_1) and table_2 (col: Avg_Dist_2), the z-scores in table_1 (col: Avg_Dist_Derived_1) and table_2 (col: Avg_Dist_Derived_2), the relative change in Avg Dist, and the dynamic p-value of the change in the protein complex (col: Dynamic_P)