Build: #5 failed

Job: Test Many Linux 2.28 failed

e2e6 1 00010 s uid a002 xd0a588 x2239 regression: Test case result

The following summarizes the result of the test "e2e6 1 00010 s uid a002 xd0a588 x2239 regression" in build 5 of Pipeline - Pipeline Main with Casa 6.6.6 test - cvpost - release-6.6.6 - Test Many Linux 2.28.
Description: e2e6 1 00010 s uid a002 xd0a588 x2239 regression
Test class: pipeline.infrastructure.utils.regression-tester
Method: test_E2E6_1_00010_S__uid___A002_Xd0a588_X2239_regression
Duration: 118 mins
Status: Failed (Existing Failure)

Error Log

Failed: Failed to match 4 result values within tolerances :
s17.hifa_gfluxscale.uid___A002_Xd0a588_X2239.field_1.spw_14.I
	values differ by > a relative difference of 1e-07
	expected: 0.4236040603669049
	new:      0.4236036086183519
	diff: 4.517485530097787e-07
	percent_diff: 0.00010664405639041712%
s17.hifa_gfluxscale.uid___A002_Xd0a588_X2239.field_2.spw_14.I
	values differ by > a relative difference of 1e-07
	expected: 0.13715885648392986
	new:      0.1371580659534747
	diff: 7.905304551525383e-07
	percent_diff: 0.0005763612175092466%
s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.gt90deg_offset_phase_vs_freqintercept
	values differ by > a relative difference of 1e-07
	expected: 93.90089072481301
	new:      94.57674226874364
	diff: -0.6758515439306336
	percent_diff: -0.7197498753353593%
s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.phase_vs_freqintercept
	values differ by > a relative difference of 1e-07
	expected: 69.88812832285063
	new:      69.84503367396479
	diff: 0.043094648885841025
	percent_diff: 0.061662330813559346%
Worst absolute diff, s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.gt90deg_offset_phase_vs_freqintercept: -0.6758515439306336
Worst percentage diff, s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.gt90deg_offset_phase_vs_freqintercept: -0.7197498753353593%
    @pytest.mark.fast
    @pytest.mark.alma
    def test_E2E6_1_00010_S__uid___A002_Xd0a588_X2239_regression():
        """Run ALMA cal+image regression on a 12m moderate-size test dataset in ASDM.
    
        Recipe name:                procedure_hifa_calimage
        Dataset:                    E2E6.1.00010.S: uid___A002_Xd0a588_X2239
        """
    
        input_dir = 'pl-regressiontest/E2E6.1.00010.S'
        ref_directory = 'pl-regressiontest/E2E6.1.00010.S'
    
        pr = PipelineRegression(recipe='procedure_hifa_calimage.xml',
                                input_dir=input_dir,
                                visname=['uid___A002_Xd0a588_X2239'],
                                expectedoutput_dir=ref_directory)
    
>       pr.run()

pipeline/infrastructure/utils/regression-tester.py:416: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pipeline/infrastructure/utils/regression-tester.py:224: in run
    self.__compare_results(new_file, default_relative_tolerance)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pipeline.infrastructure.utils.regression-tester.PipelineRegression object at 0x7fc37edbf220>
new_file = 'uid___A002_Xd0a588_X2239.NEW.results.txt'
relative_tolerance = 1e-07

    def __compare_results(self, new_file: str, relative_tolerance: float):
        """
        Compare results between new one loaded from file and old one.
    
        Args:
            new_file : file path of new results
            relative_tolerance : relative tolerance of output value
        """
        with open(self.expectedoutput_file) as expected_fd, open(new_file) as new_fd:
            expected_results = expected_fd.readlines()
            new_results = new_fd.readlines()
            errors = []
            worst_diff = (0, 0)
            worst_percent_diff = (0, 0)
            for old, new in zip(expected_results, new_results):
                try:
                    oldkey, oldval, tol = self.__sanitize_regression_string(old)
                    newkey, newval, _ = self.__sanitize_regression_string(new)
                except ValueError as e:
                    errorstr = "The results: {0} could not be parsed. Error: {1}".format(new, str(e))
                    errors.append(errorstr)
                    continue
    
                assert oldkey == newkey
                tolerance = tol if tol else relative_tolerance
                if newval is not None:
                    LOG.info(f'Comparing {oldval} to {newval} with a rel. tolerance of {tolerance}')
                    if oldval != pytest.approx(newval, rel=tolerance):
                        diff = oldval-newval
                        percent_diff = (oldval-newval)/oldval * 100
                        if abs(diff) > abs(worst_diff[0]):
                            worst_diff = diff, oldkey
                        if abs(percent_diff) > abs(worst_percent_diff[0]):
                            worst_percent_diff = percent_diff, oldkey
                        errorstr = f"{oldkey}\n\tvalues differ by > a relative difference of {tolerance}\n\texpected: {oldval}\n\tnew:      {newval}\n\tdiff: {diff}\n\tpercent_diff: {percent_diff}%"
                        errors.append(errorstr)
                elif oldval is not None:
                    # If only the new value is None, fail
                    errorstr = f"{oldkey}\n\tvalue is None\n\texpected: {oldval}\n\tnew:      {newval}"
                    errors.append(errorstr)
                else:
                    # If old and new values are both None, this is expected, so pass
                    LOG.info(f'Comparing {oldval} and {newval}... both values are None.')
    
            [LOG.warning(x) for x in errors]
            n_errors = len(errors)
            if n_errors > 0:
                summary_str = f"Worst absolute diff, {worst_diff[1]}: {worst_diff[0]}\nWorst percentage diff, {worst_percent_diff[1]}: {worst_percent_diff[0]}%"
                errors.append(summary_str)
>               pytest.fail("Failed to match {0} result value{1} within tolerance{1} :\n{2}".format(
                    n_errors, '' if n_errors == 1 else 's', '\n'.join(errors)), pytrace=True)
E               Failed: Failed to match 4 result values within tolerances :
E               s17.hifa_gfluxscale.uid___A002_Xd0a588_X2239.field_1.spw_14.I
E               	values differ by > a relative difference of 1e-07
E               	expected: 0.4236040603669049
E               	new:      0.4236036086183519
E               	diff: 4.517485530097787e-07
E               	percent_diff: 0.00010664405639041712%
E               s17.hifa_gfluxscale.uid___A002_Xd0a588_X2239.field_2.spw_14.I
E               	values differ by > a relative difference of 1e-07
E               	expected: 0.13715885648392986
E               	new:      0.1371580659534747
E               	diff: 7.905304551525383e-07
E               	percent_diff: 0.0005763612175092466%
E               s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.gt90deg_offset_phase_vs_freqintercept
E               	values differ by > a relative difference of 1e-07
E               	expected: 93.90089072481301
E               	new:      94.57674226874364
E               	diff: -0.6758515439306336
E               	percent_diff: -0.7197498753353593%
E               s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.phase_vs_freqintercept
E               	values differ by > a relative difference of 1e-07
E               	expected: 69.88812832285063
E               	new:      69.84503367396479
E               	diff: 0.043094648885841025
E               	percent_diff: 0.061662330813559346%
E               Worst absolute diff, s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.gt90deg_offset_phase_vs_freqintercept: -0.6758515439306336
E               Worst percentage diff, s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.gt90deg_offset_phase_vs_freqintercept: -0.7197498753353593%

pipeline/infrastructure/utils/regression-tester.py:290: Failed
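
To see how far each value sits from the 1e-07 threshold, the comparison in the traceback can be reproduced offline from the four expected/new pairs in the error log. The following is a minimal sketch, not the pipeline's code: it uses only the standard library and assumes that `pytest.approx(newval, rel=tolerance)` accepts a value when the absolute difference is at most `max(rel * abs(newval), 1e-12)`, with 1e-12 taken to be the default absolute tolerance. The names `REPORTED`, `within_tolerance`, and `summarize` are hypothetical.

```python
# Expected/new pairs copied from the error log above.
REPORTED = {
    "s17.hifa_gfluxscale.uid___A002_Xd0a588_X2239.field_1.spw_14.I":
        (0.4236040603669049, 0.4236036086183519),
    "s17.hifa_gfluxscale.uid___A002_Xd0a588_X2239.field_2.spw_14.I":
        (0.13715885648392986, 0.1371580659534747),
    "s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.gt90deg_offset_phase_vs_freqintercept":
        (93.90089072481301, 94.57674226874364),
    "s21.hif_applycal.uid___A002_Xd0a588_X2239.qa.metric.phase_vs_freqintercept":
        (69.88812832285063, 69.84503367396479),
}

REL_TOL = 1e-07     # relative_tolerance shown in the traceback
ABS_FLOOR = 1e-12   # assumption: default absolute tolerance of pytest.approx


def within_tolerance(expected, new, rel=REL_TOL, abs_floor=ABS_FLOOR):
    """Approximate `expected == pytest.approx(new, rel=rel)` (assumed semantics)."""
    return abs(expected - new) <= max(rel * abs(new), abs_floor)


def summarize(reported):
    """Return (key, diff, percent_diff, rel_diff, passed) for each entry."""
    rows = []
    for key, (expected, new) in reported.items():
        diff = expected - new                  # same sign convention as the log
        percent_diff = diff / expected * 100   # same formula as the log
        rel_diff = abs(diff) / abs(new)        # quantity compared against REL_TOL
        rows.append((key, diff, percent_diff, rel_diff,
                     within_tolerance(expected, new)))
    return rows


rows = summarize(REPORTED)
failures = [row for row in rows if not row[4]]
worst_abs = max(failures, key=lambda row: abs(row[1]))
worst_pct = max(failures, key=lambda row: abs(row[2]))

for key, diff, percent_diff, rel_diff, passed in rows:
    status = "ok" if passed else "FAIL"
    print(f"{status:4} {key}")
    print(f"     rel. diff {rel_diff:.3e} vs tolerance {REL_TOL:.0e}")
print(f"Worst absolute diff: {worst_abs[0]}")
print(f"Worst percentage diff: {worst_pct[0]}")
```

Under these assumptions all four entries exceed the tolerance. The two hifa_gfluxscale values miss it by roughly one to two orders of magnitude, while the two hif_applycal QA metrics differ by several orders of magnitude more, which may help when deciding whether the differences look like numerical noise or a real change in results.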