Coverage for python/lsst/pipe/base/cmdLineTask.py: 54%

29 statements  

coverage.py v6.4.4, created at 2022-08-26 09:45 +0000

#
# LSST Data Management System
# Copyright 2008-2015 AURA/LSST.
#
# This product includes software developed by the
# LSST Project (http://www.lsst.org/).
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the LSST License Statement and
# the GNU General Public License along with this program. If not,
# see <https://www.lsstcorp.org/LegalNotices/>.
#
__all__ = ["CmdLineTask", "TaskRunner", "ButlerInitializedTaskRunner"]

import contextlib

from deprecated.sphinx import deprecated

from .task import Task


@deprecated(
    reason="Replaced by lsst.utils.timer.profile(). Will be removed after v26.0",
    version="v25.0",
    category=FutureWarning,
)
@contextlib.contextmanager
def profile(filename, log=None):
    """Context manager for profiling with cProfile.

    Parameters
    ----------
    filename : `str`
        Filename to which to write profile (profiling disabled if `None` or
        empty).
    log : `logging.Logger`, optional
        Log object for logging the profile operations.

    Notes
    -----
    If profiling is enabled, the context manager yields the
    `cProfile.Profile` object (otherwise it yields `None`), which allows
    additional control over profiling. You can obtain it with the ``as``
    clause, e.g.:

    .. code-block:: python

        with profile(filename) as prof:
            runYourCodeHere()

    The output cumulative profile can be printed with a command line like:

    .. code-block:: bash

        python -c 'import pstats; \
            pstats.Stats("<filename>").sort_stats("cumtime").print_stats(30)'
    """
    if not filename:
        # Nothing to do
        yield
        return
    from cProfile import Profile

    profile = Profile()
    if log is not None:
        log.info("Enabling cProfile profiling")
    profile.enable()
    yield profile
    profile.disable()
    profile.dump_stats(filename)
    if log is not None:
        log.info("cProfile stats written to %s", filename)
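
The pattern above can be sketched in a self-contained form using only the standard library. The names `profile_sketch` and `busy_work` are hypothetical and introduced here for illustration; this is not the `lsst.utils.timer.profile()` implementation. As a design choice of the sketch (not something the function above does), the `yield` is wrapped in `try`/`finally` so the stats file is still written if the profiled block raises.

```python
import contextlib
import cProfile
import os
import pstats
import tempfile


@contextlib.contextmanager
def profile_sketch(filename, log=None):
    """Hypothetical stand-in for the deprecated ``profile`` context manager."""
    if not filename:
        # Profiling disabled: yield None so "as prof" still works.
        yield None
        return
    prof = cProfile.Profile()
    prof.enable()
    try:
        yield prof
    finally:
        # Always stop profiling and write the stats, even on error.
        prof.disable()
        prof.dump_stats(filename)
        if log is not None:
            log.info("cProfile stats written to %s", filename)


def busy_work(n=10_000):
    """Toy workload to give the profiler something to record."""
    return sum(i * i for i in range(n))


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.prof")
    with profile_sketch(path) as prof:
        total = busy_work()
    wrote_file = os.path.exists(path)
    # Load the dump while the temporary file still exists.
    stats = pstats.Stats(path)
    stats.sort_stats("cumtime")

# With an empty filename the context manager yields None.
with profile_sketch(None) as disabled:
    pass
```

Calling `stats.print_stats(30)` on the loaded object would print the same kind of cumulative report as the command line shown in the docstring.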


@deprecated(
    reason="Gen2 task runners are no longer supported. This functionality has been disabled.",
    version="v23.0",
    category=FutureWarning,
)
class TaskRunner:
    """Run a command-line task, using `multiprocessing` if requested.

    Parameters
    ----------
    TaskClass : `lsst.pipe.base.Task` subclass
        The class of the task to run.
    parsedCmd : `argparse.Namespace`
        The parsed command-line arguments, as returned by the task's argument
        parser's `~lsst.pipe.base.ArgumentParser.parse_args` method.

        .. warning::

           Do not store ``parsedCmd``, as this instance is pickled (if
           multiprocessing) and ``parsedCmd`` may contain non-picklable
           elements. It certainly contains more data than we need to send
           to each instance of the task.
    doReturnResults : `bool`, optional
        Whether to return the collected results from each invocation of the
        task. This is intended only for unit tests and similar use. It can
        easily exhaust memory (if the task returns enough data and you call it
        enough times) and it will fail when using multiprocessing if the
        returned data cannot be pickled.

        Note that even if ``doReturnResults`` is `False`, a struct with a
        single member ``exitStatus`` is returned, with value 0 or 1 to be
        returned to the Unix shell.

    Raises
    ------
    ImportError
        Raised if multiprocessing is requested (and the task supports it) but
        the multiprocessing library cannot be imported.

    Notes
    -----
    Each command-line task (subclass of `lsst.pipe.base.CmdLineTask`) has a
    task runner. By default it is this class, but some tasks require a
    subclass.

    You may use this task runner for your command-line task if your task has a
    ``runDataRef`` method that takes exactly one argument: a butler data
    reference. Otherwise you must provide a task-specific subclass of
    this runner for your task's ``RunnerClass`` that overrides
    `TaskRunner.getTargetList` and possibly
    `TaskRunner.__call__`. See `TaskRunner.getTargetList` for details.

    This design matches the common pattern for command-line tasks: the
    ``runDataRef`` method takes a single data reference, of some suitable name.
    Additional arguments are rare, and if present, require a subclass of
    `TaskRunner` that passes these additional arguments by name.

    Instances of this class must be picklable in order to be compatible with
    multiprocessing. If multiprocessing is requested
    (``parsedCmd.numProcesses > 1``) then `runDataRef` calls
    `prepareForMultiProcessing` to jettison optional non-picklable elements.
    If your task runner is not compatible with multiprocessing then indicate
    this in your task by setting class variable ``canMultiprocess=False``.

    Due to a `Python bug`__, handling a `KeyboardInterrupt` properly `requires
    specifying a timeout`__. This timeout (in sec) can be specified as the
    ``timeout`` element in the output from `~lsst.pipe.base.ArgumentParser`
    (the ``parsedCmd``), if available, otherwise we use `TaskRunner.TIMEOUT`.

    By default, we disable "implicit" threading, i.e., threading provided by
    underlying numerical libraries such as MKL or BLAS. This is designed to
    avoid thread contention both when a single command-line task spawns
    multiple processes and when multiple users are running on a shared system.
    Users can override this behaviour by setting the
    ``LSST_ALLOW_IMPLICIT_THREADS`` environment variable.

    .. __: http://bugs.python.org/issue8296
    .. __: http://stackoverflow.com/questions/1408356/
    """

    pass
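
The picklability requirement described in the Notes above can be illustrated with a small standard-library sketch. The helpers `is_picklable` and `drop_unpicklable`, and the `runner_state` dictionary, are hypothetical names made up for this example. They only illustrate the idea of jettisoning non-picklable elements before handing state to worker processes, and are not the `prepareForMultiProcessing` implementation.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if ``obj`` survives ``pickle.dumps``."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


def drop_unpicklable(state):
    """Return a copy of ``state`` without entries that cannot be pickled."""
    return {key: value for key, value in state.items() if is_picklable(value)}


# Hypothetical runner state: two plain values and one lock, which
# cannot be pickled and so could not be sent to a worker process.
runner_state = {
    "doReturnResults": False,
    "timeout": 30,
    "lock": threading.Lock(),
}

clean_state = drop_unpicklable(runner_state)
```

After this runs, `clean_state` keeps `doReturnResults` and `timeout` and omits `lock`.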


@deprecated(
    reason="Gen2 task runners are no longer supported. This functionality has been disabled.",
    version="v23.0",
    category=FutureWarning,
)
class ButlerInitializedTaskRunner(TaskRunner):
    r"""A `TaskRunner` for `CmdLineTask`\ s that require a ``butler`` keyword
    argument to be passed to their constructor.
    """

    pass


@deprecated(
    reason="CmdLineTask is no longer supported. This functionality has been disabled. Use Gen3.",
    version="v23.0",
    category=FutureWarning,
)
class CmdLineTask(Task):
    """Base class for command-line tasks: tasks that may be executed from the
    command-line.

    Notes
    -----
    See :ref:`task-framework-overview` to learn what tasks are.

    Subclasses must specify the following class variables:

    - ``ConfigClass``: configuration class for your task (a subclass of
      `lsst.pex.config.Config`, or if your task needs no configuration, then
      `lsst.pex.config.Config` itself).
    - ``_DefaultName``: default name used for this task (a `str`).

    Subclasses may also specify the following class variables:

    - ``RunnerClass``: a task runner class. The default is ``TaskRunner``,
      which works for any task with a ``runDataRef`` method that takes exactly
      one argument: a data reference. If your task does not meet this
      requirement then you must supply a variant of ``TaskRunner``; see
      ``TaskRunner`` for more information.
    - ``canMultiprocess``: the default is `True`; set `False` if your task
      does not support multiprocessing.

    Subclasses must specify a method named ``runDataRef``:

    - By default ``runDataRef`` accepts a single butler data reference, but
      you can specify an alternate task runner (subclass of ``TaskRunner``) as
      the value of class variable ``RunnerClass`` if your run method needs
      something else.
    - ``runDataRef`` is expected to return its data in a
      `lsst.pipe.base.Struct`. This provides safety for evolution of the task
      since new values may be added without harming existing code.
    - The data returned by ``runDataRef`` must be picklable if your task is to
      support multiprocessing.
    """

    pass
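
The last two points in the docstring above (returning named fields so that tasks can evolve, and keeping results picklable) can be sketched as follows. This is a rough illustration, assuming `types.SimpleNamespace` as a stand-in for `lsst.pipe.base.Struct`. The functions `run_data_ref_v1`, `run_data_ref_v2` and `summarize`, the field names `nItems` and `label`, and the list used as a fake data reference are all hypothetical.

```python
import pickle
from types import SimpleNamespace


def run_data_ref_v1(data_ref):
    """First version of a task method: returns two named fields."""
    return SimpleNamespace(exitStatus=0, nItems=len(data_ref))


def run_data_ref_v2(data_ref):
    """Later version: adds a field without touching the existing ones."""
    return SimpleNamespace(exitStatus=0, nItems=len(data_ref), label="extra")


def summarize(result):
    """Caller written against v1; it reads fields by name only."""
    return f"status={result.exitStatus} n={result.nItems}"


# A plain list stands in for a butler data reference in this sketch.
fake_data_ref = ["visit=1", "visit=2"]

summary_v1 = summarize(run_data_ref_v1(fake_data_ref))
summary_v2 = summarize(run_data_ref_v2(fake_data_ref))

# The result must survive a pickle round trip to cross process boundaries.
result_v2 = run_data_ref_v2(fake_data_ref)
restored = pickle.loads(pickle.dumps(result_v2))
```

Because `summarize` accesses fields by name, the extra `label` field in the second version does not affect it, which is the evolution property the docstring describes.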