Coverage for python/lsst/pipe/base/cmdLineTask.py: 52%

28 statements  

coverage.py v6.4.2, created at 2022-08-01 01:21 -0700

#
# LSST Data Management System
# Copyright 2008-2015 AURA/LSST.
#
# This product includes software developed by the
# LSST Project (http://www.lsst.org/).
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the LSST License Statement and
# the GNU General Public License along with this program.  If not,
# see <https://www.lsstcorp.org/LegalNotices/>.
#

__all__ = ["CmdLineTask", "TaskRunner", "ButlerInitializedTaskRunner"]

import contextlib

from deprecated.sphinx import deprecated

from .task import Task


@contextlib.contextmanager
def profile(filename, log=None):
    """Context manager for profiling with cProfile.

    Parameters
    ----------
    filename : `str`
        Filename to which to write profile (profiling disabled if `None` or
        empty).
    log : `logging.Logger`, optional
        Log object for logging the profile operations.

    If profiling is enabled, the context manager returns the cProfile.Profile
    object (otherwise it returns None), which allows additional control over
    profiling.  You can obtain this using the "as" clause, e.g.:

    .. code-block:: python

        with profile(filename) as prof:
            runYourCodeHere()

    The output cumulative profile can be printed with a command-line like:

    .. code-block:: bash

        python -c 'import pstats; \
            pstats.Stats("<filename>").sort_stats("cumtime").print_stats(30)'
    """
    if not filename:
        # Nothing to do
        yield
        return
    from cProfile import Profile

    profile = Profile()
    if log is not None:
        log.info("Enabling cProfile profiling")
    profile.enable()
    yield profile
    profile.disable()
    profile.dump_stats(filename)
    if log is not None:
        log.info("cProfile stats written to %s", filename)

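The profiling workflow described in the ``profile`` docstring can be sketched with the standard library alone. This is a minimal illustration, not the module's own code: ``profile_sketch`` and ``busy_work`` are hypothetical names introduced here, and the sketch omits the optional logging.

```python
import contextlib
import cProfile
import os
import pstats
import tempfile


@contextlib.contextmanager
def profile_sketch(filename):
    """Hypothetical stand-in mirroring the ``profile`` context manager."""
    if not filename:
        # Profiling disabled: the "as" target is None.
        yield None
        return
    prof = cProfile.Profile()
    prof.enable()
    yield prof
    prof.disable()
    prof.dump_stats(filename)


def busy_work(n):
    """Hypothetical workload that gives the profiler something to record."""
    return sum(i * i for i in range(n))


with tempfile.TemporaryDirectory() as tmpdir:
    stats_path = os.path.join(tmpdir, "example.prof")

    with profile_sketch(stats_path) as prof:
        total = busy_work(10_000)

    # Load the dumped stats while the file still exists, then sort by
    # cumulative time, as in the command line shown in the docstring.
    stats = pstats.Stats(stats_path)
    stats.sort_stats("cumtime")
    profiled_calls = stats.total_calls

# With an empty filename nothing is profiled.
with profile_sketch("") as disabled:
    disabled_result = disabled
```

Calling ``stats.print_stats(30)`` on the loaded object would print the 30 most expensive entries, matching the ``python -c`` one-liner in the docstring.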

@deprecated(
    reason="Gen2 task runners are no longer supported. This functionality has been disabled.",
    version="v23.0",
    category=FutureWarning,
)
class TaskRunner:
    """Run a command-line task, using `multiprocessing` if requested.

    Parameters
    ----------
    TaskClass : `lsst.pipe.base.Task` subclass
        The class of the task to run.
    parsedCmd : `argparse.Namespace`
        The parsed command-line arguments, as returned by the task's argument
        parser's `~lsst.pipe.base.ArgumentParser.parse_args` method.

        .. warning::

           Do not store ``parsedCmd``, as this instance is pickled (if
           multiprocessing) and parsedCmd may contain non-picklable elements.
           It certainly contains more data than we need to send to each
           instance of the task.
    doReturnResults : `bool`, optional
        Should run return the collected result from each invocation of the
        task? This is only intended for unit tests and similar use. It can
        easily exhaust memory (if the task returns enough data and you call it
        enough times) and it will fail when using multiprocessing if the
        returned data cannot be pickled.

        Note that even if ``doReturnResults`` is False a struct with a single
        member "exitStatus" is returned, with value 0 or 1 to be returned to
        the Unix shell.

    Raises
    ------
    ImportError
        Raised if multiprocessing is requested (and the task supports it) but
        the multiprocessing library cannot be imported.

    Notes
    -----
    Each command-line task (subclass of `lsst.pipe.base.CmdLineTask`) has a
    task runner. By default it is this class, but some tasks require a
    subclass.

    You may use this task runner for your command-line task if your task has a
    ``runDataRef`` method that takes exactly one argument: a butler data
    reference. Otherwise you must provide a task-specific subclass of
    this runner for your task's ``RunnerClass`` that overrides
    `TaskRunner.getTargetList` and possibly
    `TaskRunner.__call__`. See `TaskRunner.getTargetList` for details.

    This design matches the common pattern for command-line tasks: the
    ``runDataRef`` method takes a single data reference, of some suitable name.
    Additional arguments are rare, and if present, require a subclass of
    `TaskRunner` that calls these additional arguments by name.

    Instances of this class must be picklable in order to be compatible with
    multiprocessing. If multiprocessing is requested
    (``parsedCmd.numProcesses > 1``) then `runDataRef` calls
    `prepareForMultiProcessing` to jettison optional non-picklable elements.
    If your task runner is not compatible with multiprocessing then indicate
    this in your task by setting class variable ``canMultiprocess=False``.

    Due to a `python bug`__, handling a `KeyboardInterrupt` properly `requires
    specifying a timeout`__. This timeout (in sec) can be specified as the
    ``timeout`` element in the output from `~lsst.pipe.base.ArgumentParser`
    (the ``parsedCmd``), if available, otherwise we use `TaskRunner.TIMEOUT`.

    By default, we disable "implicit" threading, i.e., as provided by
    underlying numerical libraries such as MKL or BLAS. This is designed to
    avoid thread contention both when a single command line task spawns
    multiple processes and when multiple users are running on a shared system.
    Users can override this behaviour by setting the
    ``LSST_ALLOW_IMPLICIT_THREADS`` environment variable.

    .. __: http://bugs.python.org/issue8296
    .. __: http://stackoverflow.com/questions/1408356/
    """

    pass


@deprecated(
    reason="Gen2 task runners are no longer supported. This functionality has been disabled.",
    version="v23.0",
    category=FutureWarning,
)
class ButlerInitializedTaskRunner(TaskRunner):
    r"""A `TaskRunner` for `CmdLineTask`\ s that require a ``butler`` keyword
    argument to be passed to their constructor.
    """
    pass


@deprecated(
    reason="CmdLineTask is no longer supported. This functionality has been disabled. Use Gen3.",
    version="v23.0",
    category=FutureWarning,
)
class CmdLineTask(Task):
    """Base class for command-line tasks: tasks that may be executed from the
    command-line.

    Notes
    -----
    See :ref:`task-framework-overview` to learn what tasks are.

    Subclasses must specify the following class variables:

    - ``ConfigClass``: configuration class for your task (a subclass of
      `lsst.pex.config.Config`, or if your task needs no configuration, then
      `lsst.pex.config.Config` itself).
    - ``_DefaultName``: default name used for this task (a `str`).

    Subclasses may also specify the following class variables:

    - ``RunnerClass``: a task runner class. The default is ``TaskRunner``,
      which works for any task with a ``runDataRef`` method that takes exactly
      one argument: a data reference. If your task does not meet this
      requirement then you must supply a variant of ``TaskRunner``; see
      ``TaskRunner`` for more information.
    - ``canMultiprocess``: the default is `True`; set `False` if your task
      does not support multiprocessing.

    Subclasses must specify a method named ``runDataRef``:

    - By default ``runDataRef`` accepts a single butler data reference, but
      you can specify an alternate task runner (subclass of ``TaskRunner``) as
      the value of class variable ``RunnerClass`` if your run method needs
      something else.
    - ``runDataRef`` is expected to return its data in a
      `lsst.pipe.base.Struct`. This provides safety for evolution of the task
      since new values may be added without harming existing code.
    - The data returned by ``runDataRef`` must be picklable if your task is to
      support multiprocessing.
    """

    pass
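
The picklability requirement on ``runDataRef`` results can be checked ahead of time. The sketch below is only an illustration under stated assumptions: ``types.SimpleNamespace`` stands in for `lsst.pipe.base.Struct`, and ``is_picklable``, ``good_result`` and ``bad_result`` are hypothetical names that are not part of this module.

```python
import pickle
from types import SimpleNamespace


def is_picklable(obj):
    """Return True if the standard pickle module can serialize ``obj``."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# Plain data attributes can be pickled, so this result could be sent
# between processes. SimpleNamespace is an assumed stand-in for Struct.
good_result = SimpleNamespace(exitStatus=0, fluxes=[1.0, 2.5])

# A lambda cannot be pickled by the standard pickle module, so a result
# holding one would fail when multiprocessing is used.
bad_result = SimpleNamespace(exitStatus=0, callback=lambda x: x)

good_ok = is_picklable(good_result)
bad_ok = is_picklable(bad_result)
```

A check like this is cheap to run in a unit test before enabling multiprocessing for a task.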