Coverage for /wheeldirectory/casa-6.7.0-12-py3.10.el8/lib/py/lib/python3.10/site-packages/casatasks/partition.py: 89%

27 statements  

coverage.py v7.6.4, created at 2024-11-01 07:19 +0000

1##################### generated by xml-casa (v2) from partition.xml ################# 

2##################### 481ec27094e49adbbed45f6cf2c5ce2a ############################## 

3from __future__ import absolute_import 

4import numpy 

5from casatools.typecheck import CasaValidator as _val_ctor 

6_pc = _val_ctor( ) 

7from casatools.coercetype import coerce as _coerce 

8from casatools.errors import create_error_string 

9from .private.task_partition import partition as _partition_t 

10from casatasks.private.task_logging import start_log as _start_log 

11from casatasks.private.task_logging import end_log as _end_log 

12from casatasks.private.task_logging import except_log as _except_log 

13 

14class _partition: 

15 """ 

16 partition ---- Task to produce Multi-MSs using parallelism 

17 

18  

19 Partition is a task to create a Multi-MS out of an MS. General selection 

20 parameters are included, and one or all of the various data columns 

21 (DATA, LAG_DATA and/or FLOAT_DATA, and possibly MODEL_DATA and/or 

22 CORRECTED_DATA) can be selected. 

23  

24 The partition task creates a Multi-MS in parallel, using the CASA MPI framework. 

25 The user should start CASA as follows in order to run it in parallel. 

26  

27 1) Start CASA on a single node with 8 engines. The first engine will be used as the 

28 MPIClient, where the user will see the CASA prompt. All other engines will be used 

29 as MPIServers and will process the data in parallel. 

30 mpicasa -n 8 casa --nogui --log2term 

31 partition(.....) 

32  

33 2) Running on a group of nodes in a cluster. 

34 mpicasa -hostfile user_hostfile casa .... 

35 partition(.....) 

36  

37 where user_hostfile contains the names of the nodes and the number of engines to use 

38 in each one of them. Example: 

39 pc001234a, slots=5 

40 pc001234b, slots=4 

41  

42 If CASA is started without mpicasa, it is still possible to create an MMS, but 

43 the processing will be done sequentially. 

44  

45 A multi-MS is structured to have a reference MS on the top directory and a 

46 sub-directory called SUBMSS, which contains the partitioned sub-MSs. The 

47 reference MS contains links to the sub-tables of the first sub-MS. The other 

48 sub-MSs contain a copy of the sub-tables each. A multi-MS looks like this on disk. 

49  

50 ls ngc5921.mms 

51 ANTENNA FLAG_CMD POLARIZATION SPECTRAL_WINDOW table.dat 

52 DATA_DESCRIPTION HISTORY PROCESSOR STATE table.info 

53 FEED OBSERVATION SORTED_TABLE SUBMSS WEATHER 

54 FIELD POINTING SOURCE SYSCAL 

55  

56 ls ngc5921.mms/SUBMSS/ 

57 ngc5921.0000.ms/ ngc5921.0002.ms/ ngc5921.0004.ms/ ngc5921.0006.ms/ 

58 ngc5921.0001.ms/ ngc5921.0003.ms/ ngc5921.0005.ms/ 

59  

60 Inside casapy, one can use the task listpartition to list the information 

61 from a multi-MS. 

62  

63 When partition processes an MMS in parallel, each sub-MS is processed independently in an engine. 

64 The log messages of the engines are identified by the string MPIServer-#, where # gives the number 

65 of the engine running that process. When the task runs sequentially, it shows the MPIClient text 

66 in the origin of the log messages or does not show anything. 

67  

68 

69 --------- parameter descriptions --------------------------------------------- 

70 

71 vis Name of input measurement set 

72 outputvis Name of output measurement set 

73 createmms Should this create a multi-MS output 

74 separationaxis Axis to do parallelization across (scan, spw, baseline, auto) 

75 numsubms The number of SubMSs to create (auto or any number) 

76 flagbackup Create a backup of the FLAG column in the MMS. 

77 datacolumn Which data column(s) to process. 

78 field Select field using ID(s) or name(s). 

79 spw Select spectral window/channels. 

80 scan Select data by scan numbers. 

81 antenna Select data based on antenna/baseline. 

82 correlation Correlation: '' ==> all, correlation="XX,YY". 

83 timerange Select data by time range. 

84 intent Select data by scan intent. 

85 array Select (sub)array(s) by array ID number. 

86 uvrange Select data by baseline length. 

87 observation Select by observation ID(s). 

88 feed Multi-feed numbers: Not yet implemented. 

89 disableparallel Create a multi-MS in parallel. 

90 ddistart Do not change this parameter. For internal use only. 

91 taql Table query for nested selections 

92 

93 --------- examples ----------------------------------------------------------- 

94 

95  

96  

97  

98 ----- Detailed description of keyword arguments ----- 

99  

100 vis -- Name of input visibility file 

101 default: none; example: vis='ngc5921.ms' 

102  

103 outputvis -- Name of output visibility file 

104 default: none; example: outputvis='ngc5921.mms' 

105  

106 createmms -- Create a multi-MS as the output. 

107 default: True 

108 If False, it will work like the split task and create a 

109 normal MS, split according to the given data selection parameters. 

110 Note that, when this parameter is set to False, a cluster 

111 will not be used. 

112  

113 separationaxis -- Axis to do parallelization across. 

114 default: 'auto' 

115 Options: 'scan', 'spw', 'baseline', 'auto' 

116  

117 - The 'auto' option will partition per scan/spw to obtain optimal load balancing with the 

118 following criteria: 

119  

120 1. Maximize the scan/spw/field distribution across sub-MSs 

121 2. Generate sub-MSs with similar size 

122  

123 - The 'scan' or 'spw' axes will partition the MS into scan or spw. The individual sub-MSs may 

124 not be balanced with respect to the number of rows. 

125  

126 - The 'baseline' axis is mostly useful for Single-Dish data. This axis will partition the MS 

127 based on the available baselines. If the user wants only auto-correlations, use the 

128 antenna selection such as antenna='*&&&' together with this separation axis. Note that 

129 if numsubms='auto', partition will try to create as many subMSs as the number of available 

130 servers in the cluster. If the user wants to have one subMS for each baseline, set the numsubms 

131 parameter to a number higher than the number of baselines to achieve this. 

132  

133 numsubms -- The number of sub-MSs to create. 

134 default: 'auto' 

135 Options: any integer number (example: numsubms=4) 

136  

137 The default 'auto' is to partition using the number of available servers in the cluster. 

138 If the task is unable to determine the number of running servers, or the user did not start CASA 

139 using mpicasa, numsubms will use 8 as the default. 

140  

141 Example: Launch CASA with 5 engines, where 4 of them will be used to create the MMS. The first 

142 engine is used as the MPIClient. 

143  

144 mpicasa -n 5 casa --nogui --log2term 

145 CASA> partition('uid__A1', outputvis='test.mms') 

146  

147 flagbackup -- Make a backup of the FLAG column of the output MMS. When the 

148 MMS is created, the .flagversions of the input MS are not transferred, 

149 therefore it is necessary to re-create it for the new MMS. Note 

150 that multiple backups from the input MS will not be preserved. This 

151 will create a single backup of all the flags present in the input 

152 MS at the time the MMS is created. 

153 default: True 

154  

155 datacolumn -- Which data column to use when partitioning the MS. 

156 default='all'; example: datacolumn='data' 

157 Options: 'data', 'model', 'corrected', 'all', 

158 'float_data', 'lag_data', 'float_data,data', and 

159 'lag_data,data'. 

160 N.B.: 'all' = whichever of the above that are present. 

161  

162 ---- Data selection parameters (see help par.selectdata for more detailed 

163 information) 

164  

165 field -- Select field using field id(s) or field name(s). 

166 [run listobs to obtain the list of ids or names] 

167 default: ''=all fields. If the field string is a non-negative 

168 integer, it is assumed to be a field index; 

169 otherwise, it is assumed to be a field name. 

170 field='0~2'; field ids 0,1,2 

171 field='0,4,5~7'; field ids 0,4,5,6,7 

172 field='3C286,3C295'; fields named 3C286 and 3C295 

173 field = '3,4C*'; field id 3, all names starting with 4C 

174  

175 spw -- Select spectral window/channels 

176 default: ''=all spectral windows and channels 

177 spw='0~2,4'; spectral windows 0,1,2,4 (all channels) 

178 spw='<2'; spectral windows less than 2 (i.e. 0,1) 

179 spw='0:5~61'; spw 0, channels 5 to 61 

180 spw='0,10,3:3~45'; spw 0,10 all channels, spw 3 - chans 3 to 45. 

181 spw='0~2:2~6'; spw 0,1,2 with channels 2 through 6 in each. 

182 spw = '*:3~64' channels 3 through 64 for all spw ids 

183 spw = ' :3~64' will NOT work. 

184 spw = '*:0;60~63' channel 0 and channels 60 to 63 for all IFs 

185 ';' needed to separate different channel ranges in one spw 

186 spw='0:0~10;15~60'; spectral window 0 with channels 0-10,15-60 

187 spw='0:0~10,1:20~30,2:1;2;4'; spw 0, channels 0-10, 

188 spw 1, channels 20-30, and spw 2, channels 1, 2 and 4 

189  

190 antenna -- Select data based on antenna/baseline 

191 default: '' (all) 

192 Non-negative integers are assumed to be antenna indices, and 

193 anything else is taken as an antenna name. 

194  

195 Examples: 

196 antenna='5&6': baseline between antenna index 5 and index 6. 

197 antenna='VA05&VA06': baseline between VLA antenna 5 and 6. 

198 antenna='5&6;7&8': baselines 5-6 and 7-8 

199 antenna='5': all baselines with antenna 5 

200 antenna='5,6,10': all baselines including antennas 5, 6, or 10 

201 antenna='5,6,10&': all baselines with *only* antennas 5, 6, or 

202 10. (cross-correlations only. Use && 

203 to include autocorrelations, and &&& 

204 to get only autocorrelations.) 

205 antenna='!ea03,ea12,ea17': all baselines except those that 

206 include EVLA antennas ea03, ea12, or 

207 ea17. 

208  

209 timerange -- Select data based on time range: 

210 default = '' (all); examples, 

211 timerange = 'YYYY/MM/DD/hh:mm:ss~YYYY/MM/DD/hh:mm:ss' 

212 Note: if the YYYY/MM/DD date is missing, timerange defaults to the 

213 first day in the dataset 

214 timerange='09:14:0~09:54:0' picks 40 min on first day 

215 timerange='25:00:00~27:30:00' picks 1 hr to 3 hr 30min 

216 on next day 

217 timerange='09:44:00' data within one integration of time 

218 timerange='>10:24:00' data after this time 

219  

220 array -- (Sub)array number range 

221 default: ''=all 

222  

223 uvrange -- Select data within uvrange (default units meters) 

224 default: ''=all; example: 

225 uvrange='0~1000klambda'; uvrange from 0-1000 kilo-lambda 

226 uvrange='>4klambda'; uvranges greater than 4 kilo-lambda 

227 uvrange='0~1000km'; uvrange in kilometers 

228  

229 scan -- Scan number range 

230 default: ''=all 

231  

232 observation -- Select by observation ID(s) 

233 default: ''=all 

234  

235  

236 ------ EXAMPLES ------ 

237  

238 1) Create a Multi-MS of some spws, partitioned per spw. The MS contains 16 spws. 

239 partition('uid001.ms', outputvis='source.mms', spw='1,3~10', separationaxis='spw') 

240  

241 2) Create a Multi-MS but select only the first channels of all spws. Do not back up the FLAG 

242 column. 

243 partition('uid0001.ms', outputvis='fechans.mms', spw='*:1~10', flagbackup=False) 

244  

245 3) Create a Multi-MS using both separation axes. 

246 partition('uid0001.ms', outputvis='myuid.mms', createmms=True, separationaxis='auto') 

247  

248 4) Create a single-dish Multi-MS using the baseline axis only for the auto-correlations. 

249 partition('uid0001.ms', outputvis='myuid.mms', createmms=True, separationaxis='baseline', antenna='*&&&') 

250 

251 

252 """ 

253 

254 _info_group_ = """manipulation""" 

255 _info_desc_ = """Task to produce Multi-MSs using parallelism""" 

256 

257 def __call__( self, vis='', outputvis='', createmms=True, separationaxis='auto', numsubms='auto', flagbackup=True, datacolumn='all', field='', spw='', scan='', antenna='', correlation='', timerange='', intent='', array='', uvrange='', observation='', feed='', disableparallel=False, ddistart=int(-1), taql='' ): 

258 schema = {'vis': {'type': 'cReqPath', 'coerce': _coerce.expand_path}, 'outputvis': {'type': 'cStr', 'coerce': _coerce.to_str}, 'createmms': {'type': 'cBool'}, 'separationaxis': {'type': 'cStr', 'coerce': _coerce.to_str, 'allowed': [ 'baseline', 'AUTO', 'SPW', 'SCAN', 'auto', 'spw', 'BASELINE', 'scan' ]}, 'numsubms': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cInt'}]}, 'flagbackup': {'type': 'cBool'}, 'datacolumn': {'type': 'cStr', 'coerce': _coerce.to_str, 'allowed': [ 'DATA', 'model', 'corrected', 'LAG_DATA', 'lag_data', 'FLOAT_DATA,DATA', 'FLOAT_DATA', 'CORRECTED', 'lag_data,data', 'float_data', 'float_data,data', 'DATA,MODEL,CORRECTED', 'ALL', 'MODEL', 'all', 'data,model,corrected', 'LAG_DATA,DATA', 'data' ]}, 'field': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 'spw': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 'scan': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 'antenna': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 'correlation': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}]}, 'timerange': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 
'intent': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 'array': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 'uvrange': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 'observation': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 'feed': {'anyof': [{'type': 'cStr', 'coerce': _coerce.to_str}, {'type': 'cStrVec', 'coerce': [_coerce.to_list,_coerce.to_strvec]}, {'type': 'cInt'}, {'type': 'cIntVec', 'coerce': [_coerce.to_list,_coerce.to_intvec]}]}, 'disableparallel': {'type': 'cBool'}, 'ddistart': {'type': 'cInt'}, 'taql': {'type': 'cStr', 'coerce': _coerce.to_str}} 

259 doc = {'vis': vis, 'outputvis': outputvis, 'createmms': createmms, 'separationaxis': separationaxis, 'numsubms': numsubms, 'flagbackup': flagbackup, 'datacolumn': datacolumn, 'field': field, 'spw': spw, 'scan': scan, 'antenna': antenna, 'correlation': correlation, 'timerange': timerange, 'intent': intent, 'array': array, 'uvrange': uvrange, 'observation': observation, 'feed': feed, 'disableparallel': disableparallel, 'ddistart': ddistart, 'taql': taql} 

260 assert _pc.validate(doc,schema), create_error_string(_pc.errors) 

261 _logging_state_ = _start_log( 'partition', [ 'vis=' + repr(_pc.document['vis']), 'outputvis=' + repr(_pc.document['outputvis']), 'createmms=' + repr(_pc.document['createmms']), 'separationaxis=' + repr(_pc.document['separationaxis']), 'numsubms=' + repr(_pc.document['numsubms']), 'flagbackup=' + repr(_pc.document['flagbackup']), 'datacolumn=' + repr(_pc.document['datacolumn']), 'field=' + repr(_pc.document['field']), 'spw=' + repr(_pc.document['spw']), 'scan=' + repr(_pc.document['scan']), 'antenna=' + repr(_pc.document['antenna']), 'correlation=' + repr(_pc.document['correlation']), 'timerange=' + repr(_pc.document['timerange']), 'intent=' + repr(_pc.document['intent']), 'array=' + repr(_pc.document['array']), 'uvrange=' + repr(_pc.document['uvrange']), 'observation=' + repr(_pc.document['observation']), 'feed=' + repr(_pc.document['feed']), 'disableparallel=' + repr(_pc.document['disableparallel']), 'ddistart=' + repr(_pc.document['ddistart']), 'taql=' + repr(_pc.document['taql']) ] ) 

262 task_result = None 

263 try: 

264 task_result = _partition_t( _pc.document['vis'], _pc.document['outputvis'], _pc.document['createmms'], _pc.document['separationaxis'], _pc.document['numsubms'], _pc.document['flagbackup'], _pc.document['datacolumn'], _pc.document['field'], _pc.document['spw'], _pc.document['scan'], _pc.document['antenna'], _pc.document['correlation'], _pc.document['timerange'], _pc.document['intent'], _pc.document['array'], _pc.document['uvrange'], _pc.document['observation'], _pc.document['feed'], _pc.document['disableparallel'], _pc.document['ddistart'], _pc.document['taql'] ) 

265 except Exception as exc: 

266 _except_log('partition', exc) 

267 raise 

268 finally: 

269 task_result = _end_log( _logging_state_, 'partition', task_result ) 

270 return task_result 

271 

272partition = _partition( ) 
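The `__call__` method in the listing above follows a fixed sequence: validate the arguments against a schema, log the start of the task, run the implementation inside `try`/`except`/`finally`, and log the end whether or not it failed. The sketch below shows that control flow in isolation. All names in it (`run_task`, `validate_partition`, the list used as a log) are hypothetical stand-ins for illustration and are not part of casatasks or casatools; the validation rules are a small subset chosen to mirror the `allowed` list for `separationaxis`.

```python
# Hypothetical sketch of the validate -> log -> run -> log pattern.
# None of these names exist in casatasks; they are illustrative only.

AXES = ("auto", "scan", "spw", "baseline")
# Like the generated schema, accept all-lowercase and all-uppercase spellings only.
ALLOWED_AXES = {spelling for axis in AXES for spelling in (axis, axis.upper())}


def validate_partition(params):
    """Return a list of error strings; an empty list means the input is valid."""
    errors = []
    vis = params.get("vis")
    if not isinstance(vis, str) or not vis:
        errors.append("vis must be a non-empty string")
    if params.get("separationaxis", "auto") not in ALLOWED_AXES:
        errors.append("separationaxis must be one of: " + ", ".join(AXES))
    return errors


def run_task(name, task, validate, log, **params):
    """Validate params, log the call, run the task, and always log the end."""
    errors = validate(params)
    if errors:
        # The generated wrapper uses an assert here; raising has the same effect.
        raise AssertionError("; ".join(errors))
    log.append(f"begin {name}")
    result = None
    try:
        result = task(**params)
    except Exception as exc:
        log.append(f"error {name}: {exc}")
        raise  # re-raise so the caller still sees the failure
    finally:
        log.append(f"end {name}")  # runs on success and on failure
    return result
```

Because the end-of-task entry is written in the `finally` clause, a failing task still produces a complete begin/error/end record before the exception reaches the caller.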

273
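The docstring describes id selections such as `field='0~2'` (ids 0, 1, 2) and `field='0,4,5~7'` (ids 0, 4, 5, 6, 7), where `~` marks an inclusive range and `,` separates entries. The sketch below expands that numeric subset of the syntax. `expand_ids` is a hypothetical helper written for illustration only; it is not the CASA selection parser and does not handle names, wildcards, comparisons such as `'<2'`, or channel ranges after `:`.

```python
def expand_ids(selection):
    """Expand a numeric selection such as '0,4,5~7' into a sorted list of ids.

    Hypothetical helper for illustration; handles only integers and
    inclusive 'start~stop' ranges separated by commas.
    """
    ids = set()
    for token in selection.split(","):
        token = token.strip()
        if not token:
            continue  # an empty selection contributes no ids in this sketch
        if "~" in token:
            start, stop = (int(part) for part in token.split("~"))
            ids.update(range(start, stop + 1))  # '~' ranges are inclusive
        else:
            ids.add(int(token))
    return sorted(ids)


print(expand_ids("0,4,5~7"))
```

Note that in the task itself an empty string means "select everything"; this sketch returns an empty list for it because it has no dataset to enumerate.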