Diffstat (limited to 'ISSUES')
-rw-r--r-- | ISSUES | 300 |
1 files changed, 79 insertions, 221 deletions
@@ -1,248 +1,106 @@ | |||
1 | 1 | ||
2 | * I'm currently confused about the statistics API bug (from C), and how shutdown/disconnect is/should be handled. | 2 | in gnunet-java-ext there is a working example of a service and a corresponding client program, that actually work ;) |
3 | * gnunet vs GNUnet, gnunet-java project name | 3 | * simple greeting server, client gives name and server returns greeting |
4 | * I'm currently trying to increase the robustness of the service APIs, | 4 | * illustrates using program/service, using the configuration, creating messages |
5 | discuss behavior in some cases | 5 | * works with os control pipe / arm |
6 | * packaging requirements | ||
7 | 6 | ||
7 | arm-4477 WARNING Configuration file `(null)' for service `greeting' not valid: option missing | ||
8 | 8 | ||
9 | ==================================================================== | 9 | even when not using the signal pipe, does arm really kill processes? |
10 | * arm never seems to send a SIGKILL if the process does not respond | ||
11 | * sometimes arm command hangs! | ||
10 | 12 | ||
11 | * IzPack is ~15MB, do we really want to have it in the svn repo? | 13 | $ gnunet-arm -c config/greeting.conf -k greeting -LDEBUG |
12 | No, env var. | 14 | Aug 29 19:25:22-808046 arm-api-5971 INFO Stopping service `greeting' within 60000 ms |
15 | Service `greeting' was already not running. | ||
13 | 16 | ||
14 | * the Runabout can now be an anonymous inner class | ||
15 | * implementation overhead: for public subclasses only constant overhead in the Runabout constructor | ||
16 | * for private/anonymous inner classes: 10-20% overhead, measured over 100M calls | ||
17 | * only issue left: visit methods have to be public, but this is a non issue. | ||
18 | 17 | ||
19 | * review the RequestQueue mechanism | 18 | Aug 29 18:21:19-505909 arm-4751 ERROR Failed to start service `greeting' |
19 | Service `greeting' has been started. | ||
20 | 20 | ||
21 | * Statistics: | 21 | logging with arm: what gets piped to where |
22 | * currently watches can't be canceled on the service level, only on the api level, is this intentional? | 22 | * seems like service stdout->/dev/null, service stderr->arm stderr |
23 | => fine for now | ||
24 | 23 | ||
25 | * DHT: | 24 | Construct: has gotten very complex, i'm currently trying to trace a particular bug |
26 | * getStart timeout does not really make sense (why timeout for transmission to the service, | 25 | * UPDATE: bug is gone! |
27 | but not for retrieval of answers?) | 26 | * unit tests for construct have gotten better, still not good enough |
28 | Is this just a documentation error? | 27 | * FrameSize |
29 | => fine to NOT have the timeout argument in the Java API | 28 | * recursive messages |
29 | * indirect recursion (over unions) (see core.SendMessage) works | ||
30 | * direct recursion has problems (see test/org.gnunet.construct.FrameSizeTest) | ||
30 | 31 | ||
31 | * core: | 32 | Question: Is all this too complicated? Should I invest the time to fix things as they are now intended, or should |
32 | * what happens if during a transmission the service disconnects? should we retry? | 33 | we simplify? |
33 | => "no" | ||
34 | 34 | ||
35 | * with the changes in ARM there is no way to restart gnunet with arm if some client is misbehaving | 35 | * found some problems with timeouts in client/connection. |
36 | => SVN UP | 36 | * should i write unit-tests for this timing-stuff? |
37 | * UPDATE: should be fixed now! | ||
37 | 38 | ||
38 | * I'm often getting | 39 | * program/service in general: |
39 | May 09 09:21:49-194786 util-11121 WARNING `socket' failed at connection.c:892 with error: Too many open files | 40 | * how to handle the return value of main? |
40 | => SVN UP | 41 | * java has no return value for main |
42 | * we must use System.exit(n) instead | ||
43 | * how about a (Program/Service).exit(n) that does cleanup and then calls System.exit(n)? | ||
41 | 44 | ||
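A minimal sketch of the `(Program/Service).exit(n)` idea from the notes above: run registered cleanup tasks, then terminate with the given status. `ExitHelper`, `onExit` and the injectable terminator are hypothetical names chosen for illustration; they are not part of gnunet-java. The terminator defaults to `System.exit`, and is injectable only so the behavior can be exercised without killing the JVM.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.IntConsumer;

/** Hypothetical helper, not an existing gnunet-java class. */
final class ExitHelper {
    // Cleanup tasks, run in reverse order of registration (LIFO).
    private final Deque<Runnable> cleanupTasks = new ArrayDeque<>();
    // What finally terminates the process; System.exit by default.
    private final IntConsumer terminator;

    ExitHelper() {
        this(System::exit);
    }

    ExitHelper(IntConsumer terminator) {
        this.terminator = terminator;
    }

    /** Register a task to run before the process exits. */
    void onExit(Runnable task) {
        cleanupTasks.push(task);
    }

    /** Run all cleanup tasks, then hand the status code to the terminator. */
    void exit(int status) {
        while (!cleanupTasks.isEmpty()) {
            Runnable task = cleanupTasks.pop();
            try {
                task.run();
            } catch (RuntimeException e) {
                // A failing cleanup task should not prevent the remaining ones.
                System.err.println("cleanup task failed: " + e);
            }
        }
        terminator.accept(status);
    }
}
```

A program would call `helper.exit(n)` wherever C code would return `n` from main.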
45 | * configuration: what is $DEFAULTCONFIG? and CONFIG=? | ||
42 | 46 | ||
43 | ================== | 47 | mesh: |
48 | * we can't use multiple instances of org.gnunet.mesh.Mesh to test the API | ||
49 | * the local peer can't be the peer on the other end! | ||
50 | * the C-api has practically no coverage on mesh_api.c | ||
51 | * see below for a suggestion! | ||
52 | * LOCAL_TUNNEL_CREATE was used for two things, creating tunnels and being notified about incoming tunnels | ||
44 | 53 | ||
45 | #!bin/sh | 54 | * extending org.gnunet.testing so that multiple peers can be started and communicate with each other! |
46 | java -jar $GNUNET_JAVA_PREFIX/lib/gnunet.jar | 55 | * the C-testing-api allows to create multiple peers, but they don't seem to be able to communicate with each |
56 | other! | ||
57 | * => the peers should somehow exchange their hellos / use a shared directory for the hellos | ||
47 | 58 | ||
59 | testbed vs testing: is this correct? | ||
60 | * testbed for large-scale testing across many "real" nodes | ||
61 | * testing for testcases on one host | ||
48 | 62 | ||
49 | ================== | 63 | * test coverage approaching a better state, any feedback? |
50 | 64 | ||
51 | #!/bin/sh | 65 | * I think there should be some documentation in addition to the source code and the tutorial |
52 | if test -z "$GNUNET_JAVA_PREFIX" | 66 | * what would be the preferred format for such documentation? latex? |
53 | then | 67 | * (considering that they perhaps should end up on a website / should be browsable) |
54 | GNUNET_JAVA_PREFIX=%INSTALL_PREFIX% | 68 | * examples: |
55 | fi | 69 | * how do unions work in construct? |
56 | java -jar "$GNUNET_JAVA_PREFIX/lib/gnunet.jar" | 70 | * other stuff in construct |
71 | * how does annotation processing work? | ||
72 | * project layout - what goes where | ||
73 | * as there is no standard java project layout | ||
57 | 74 | ||
58 | 75 | ||
76 | rationale for putting org.gnunet.testing in the src/-tree, not in test/: | ||
77 | * when developing an extension for gnunet-java, the developers may want to access the testing functionality from | ||
78 | the jar, so it should be included there! | ||
79 | * the code itself is tested, we want coverage etc. | ||
59 | 80 | ||
60 | =========================================================== | 81 | * is there an effort to document what hostkeys files are etc.? |
61 | 82 | ||
62 | * @UNIXONLY@ PORT = 2089 in src/util/resolver.conf.on | 83 | continuous integration: using Jenkins (=Hudson, forked away from Oracle) |
63 | * what's the purpose? we also need this line to be enabled on JAVAPORT | 84 | * easier to configure than buildbot |
85 | * support for JUnit out-of-the-box | ||
86 | * support for cobertura via plugin | ||
64 | 87 | ||
65 | * something I didn't think through from the beginning: | 88 | build system: still maintaining bash scripts, trying out gradle |
66 | how should unknown message types be handled by message handlers? | 89 | * faster builds, build scripts far easier to read/write |
67 | * sometimes we want to see the message (e.g. in server), sometimes it is an error | 90 | * no integration with cobertura :( |
68 | * alternative 1: signal an error to the message handler, somehow pass the original message | 91 | * gradle has excellent Ant integration => Coverage (bash wrapper not very usable) |
69 | * alternative 2: pass a special UnknownMessage to the message handler, filter it for higher-level | 92 | * can generate project files for eclipse/intellij |
70 | APIs and signal an error. | ||
71 | 93 | ||
94 | what's next? | ||
95 | * some actual stuff built with gnunet-java? | ||
72 | 96 | ||
73 | * the recvDone is kind of clunky (see test/org.gnunet.util.ServerExample) | 97 | * general question: when should an api use per-connection receive, and per-service-receive? |
74 | 98 | ||
75 | * finally core/statistics/dht/resolver/... have unit tests! | 99 | does this make sense / do we need it: |
76 | (and coverage works again, but i can't access the cobertura via ssh yet) | 100 | * support in the scheduler for communicating asynchronously with other processes via stdin/stdout/stderr |
77 | * currently most tests rely on a running gnunet, and use the default configuration | 101 | * currently only used in testing, uses blocking i/o |
78 | * alternative approach: | ||
79 | * configuration is copied from resource (may be in a jar!) to a temp file, then passed on the command | ||
80 | line to gnunet-service-* | ||
81 | * useful to test behavior on disconnects | ||
82 | * problem: java sucks at managing processes, processes stay alive if we abort a test | ||
83 | * dht get: can we assume that "our" peer immediately stores the value? | ||
84 | 102 | ||
85 | * naming: when do we use destroy, when do we use disconnect, or is this arbitrary? | 103 | * stream is implemented as a library, not as a service |
86 | start-stop, create-destroy, connect-disconnect; constructor: fine, destructor: see C API | 104 | * why? |
105 | * should GNUnet-java also implement it? | ||
87 | 106 | ||
88 | * implications of using exceptions in callbacks | ||
89 | * esp. when the exception is non-fatal, i.e. the exception is caught, handled, and the program continues | ||
90 | * java exceptions have no restarts, may leave gnunet-java in inconsistent state; FINALLY! | ||
91 | |||
92 | * how to test callbacks? we do not only need to test for the right values, but also check that callback has actually | ||
93 | been called. | ||
94 | * first approach: throw a TestSuccess exception, discarded (see above) | ||
95 | * current approach: build a list of assertions, check assertions after scheduler is done. | ||
96 | * each assertion stores whether it already succeeded and a message | ||
97 | |||
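The "list of assertions" approach described above could look roughly like the following sketch. `CallbackAssertions` and `Expectation` are hypothetical names, not the actual gnunet-java test code: each expectation stores a message and whether it already succeeded, callbacks mark it as succeeded, and the test checks all of them once the scheduler is done.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical test helper, not the actual gnunet-java implementation. */
final class CallbackAssertions {
    /** One expectation: a message plus whether it has already succeeded. */
    static final class Expectation {
        private final String message;
        private boolean succeeded = false;

        Expectation(String message) {
            this.message = message;
        }

        /** Called from inside the callback under test. */
        void succeed() {
            succeeded = true;
        }
    }

    private final List<Expectation> expectations = new ArrayList<>();

    /** Register an expectation before the scheduler runs. */
    Expectation expect(String message) {
        Expectation e = new Expectation(message);
        expectations.add(e);
        return e;
    }

    /** Messages of all expectations whose callback never fired. */
    List<String> failures() {
        List<String> failed = new ArrayList<>();
        for (Expectation e : expectations) {
            if (!e.succeeded) {
                failed.add(e.message);
            }
        }
        return failed;
    }

    /** Call after the scheduler is done; fails if any callback was missed. */
    void verify() {
        List<String> failed = failures();
        if (!failed.isEmpty()) {
            throw new AssertionError("callbacks not invoked: " + failed);
        }
    }
}
```

Unlike throwing a success exception from inside the callback, nothing here unwinds through the scheduler, so a missed callback shows up as an ordinary test failure at the end.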
98 | |||
99 | * finalizers: used to destroy object. lead to heisenbug/double disconnect. now policy: check if object | ||
100 | has been disposed of properly, otherwise log a warning. | ||
101 | * cannot guarantee cleanup anyway | ||
102 | (java behavior: run finalizers iff unreachable during gc, may never happen, finalizers on jvm shutdown deprecated) | ||
103 | |||
104 | |||
105 | * regarding peerinfo | ||
106 | * i don't fully understand how HELLOs work. | ||
107 | |||
108 | * what are the next important services to implement? (probably mesh, peerinfo, transport) | ||
109 | |||
110 | |||
111 | * i'm currently considering using the google guava library | ||
112 | * Apache License 2.0 | ||
113 | * con: large (1.8MB jar) | ||
114 | * but: unused class files could be stripped from the jar, e.g. using ProGuard | ||
115 | * would replace the apache commons io library | ||
116 | * contains collections used throughout gnunet | ||
117 | * bloomfilter | ||
118 | * multimap, alleviates the boilerplate code when dealing with hashmaps of lists | ||
119 | * methods to deal with "signed" primitives | ||
120 | * dealing with files (e.g. copying) | ||
121 | * redirecting i/o streams from/to files (when issuing external commands) | ||
122 | * hashing utilities | ||
123 | * tables (mapping from key pair to value), would replace nested Maps in e.g. Configurations | ||
124 | * ... | ||
125 | |||
126 | |||
127 | |||
128 | |||
129 | * long-term todos: | ||
130 | * refactor the Construct implementation, implement the "nicer" syntax | ||
131 | * refactor the getopt implementation | ||
132 | |||
133 | |||
134 | |||
135 | |||
136 | ============================================================ | ||
137 | |||
138 | * tests now run on the cobertura account :) | ||
139 | * see https://gnunet.org/cobertura/ | ||
140 | * gnunet needs to be compiled with --disable-nls to work on the server | ||
141 | * cronjob added | ||
142 | * could/should we report the success of JUnit tests? | ||
143 | * (as somehow my bugs only seem to show when running on another system ;) | ||
144 | |||
145 | * gnunet-java can now start/restart services for testing with the testing wrapper executable | ||
146 | * i duplicated the code for GNUNET_TESTING_service_run_restartable | ||
147 | * now passes the Peer to main, which can start/stop it | ||
148 | * probably also should pass the config file name (not only the handle) | ||
149 | * how about a GNUNET_TESTING_peer_get_config_path? | ||
150 | * what about windows? | ||
151 | |||
152 | * server/service: | ||
153 | * needed for testing the server: getting an unused port with java | ||
154 | * how do we test the signal pipeline? | ||
155 | * probably by running arm in a testing peer, but then we would also have to talk to arm | ||
156 | |||
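For the "getting an unused port with java" item above, one common approach is to bind a `ServerSocket` to port 0 and let the operating system pick a free port. This is only a sketch; `FreePort` is a hypothetical name, and the port is not reserved after the socket is closed, so another process could in principle grab it before the server under test binds to it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

/** Hypothetical helper for tests, not an existing gnunet-java class. */
final class FreePort {
    /** Ask the OS for a currently unused TCP port by binding to port 0. */
    static int find() {
        try (ServerSocket socket = new ServerSocket(0)) {
            // getLocalPort() returns the port the OS actually assigned.
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The returned port could then be written into the temporary configuration that is passed to the server being tested.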
157 | |||
158 | the following notes are old: | ||
159 | * we are still not able to test "temporary destruction" | ||
160 | * because every time a new port may be generated | ||
161 | * we need a way to kill a service and run it with the same config | ||
162 | * now implemented! | ||
163 | * review if everything in Makefile.am is correct | ||
164 | * what is *_DEPENDENCIES vs *_LDADD? | ||
165 | * how do we get a handle to stdin (in a non-hacky way) that can be selected on by scheduler? | ||
166 | * c-getopt question: GNUNET_GETOPT_run returns index of first non-option. | ||
167 | so options and non-options may not be mixed? | ||
168 | * i have to manually write the config file, why isn't there a way to get the | ||
169 | file name, not just the ConfigurationHandle? | ||
170 | * could we get a confirmation that a service *really* is dead? | ||
171 | * or is this guaranteed if TESTING_service_run returns? | ||
172 | * otherwise there is no way to reliably test behavior on interrupted connection | ||
173 | * TESTING_service_run does not indicate that the service could not be run! | ||
174 | * just logs an error | ||
175 | * currently we check availability of data on stderr. | ||
176 | |||
177 | |||
178 | PEERINFO_GET vs PEERINFO_GET_ALL | ||
179 | * if we have PEERINFO_GET_ALL, why is the peer-field in ListPeerMessage(with type=PEERINFO_GET) empty? | ||
180 | |||
181 | |||
182 | * are peerinfo requests queued? | ||
183 | * i remember discussing peerinfo as a complicated example for a general message queueing implementation. why? | ||
184 | * if we have to queue: how about a "choke/release" for the outgoing message queue? | ||
185 | * otherwise we don't know which request belongs to which response | ||
186 | * can peerinfo return more than one record per peer? | ||
187 | |||
188 | |||
189 | -------------------------------------------------------------- | ||
190 | |||
191 | * see mantis for problem with the signal pipe | ||
192 | |||
193 | * what should the gnunet-testing-run-server tool be called, now that | ||
194 | there already is plain gnunet-testing? | ||
195 | |||
196 | * where should TESTMessage and HELLOMessage, PeerIdentity, HashCode go? | ||
197 | * and do we want to call them TESTMessage or TestMessage? | ||
198 | |||
199 | * had a bug in the IPv6 address parsing code | ||
200 | * tried to fix it / rewrite it, eventually got frustrating | ||
201 | * found out guava has an implementation of this :) | ||
202 | * also implements shortening (like ::1) | ||
203 | * by reading the code: implementing all this correctly would not have been a fun time | ||
204 | |||
205 | * TestingServer now allows the client/connection/server to be tested easily | ||
206 | * found quite some bugs during this | ||
207 | |||
208 | * thoughts about exponential backoff / the client-connection stuff in GNUnet and gnunet-java | ||
209 | * why do we wait the entire backoff period, if the connection could be available earlier? | ||
210 | |||
211 | * discuss what mesh does, what transport does | ||
212 | * i found the documentation for transport on gnunet.org | ||
213 | * there is not much information about mesh, except for the source code | ||
214 | |||
215 | --------------------------------------------------------------- | ||
216 | |||
217 | * reference count / receive_done behavior is a bit strange / confusing | ||
218 | * clients are disconnected only when refcnt==0 *and* shutdown is requested? | ||
219 | * behavior on receive done: when success=1 but refcnt=0, why don't we disconnect the client? | ||
220 | |||
221 | /** | ||
222 | * Was processing if incoming messages suspended while | ||
223 | * we were still processing data already received? | ||
224 | * This is a counter saying how often processing was | ||
225 | * suspended (once per handler invoked). | ||
226 | */ | ||
227 | |||
228 | I don't understand that comment! | ||
229 | |||
230 | |||
231 | |||
232 | * I'm currently confused about the different layers of GNUnet / I don't get the big picture | ||
233 | * e.g. transport's distance vector plugin vs mesh | ||
234 | * peerinfo / mesh | ||
235 | * assuming a large network, doesn't a client have to store a large amount of information? | ||
236 | |||
237 | * how to test MESH? | ||
238 | * maybe talk to Harsha about testbed? :) | ||
239 | |||
240 | * interesting things happen with JUnit | ||
241 | * failure of one test causes timeout in another | ||
242 | |||
243 | * review org.gnunet.testing | ||
244 | |||
245 | cp a x ; cp b x | ||
246 | is not the same as | ||
247 | cp b x ; cp a x | ||
248 | if x does not exist prior to copying | ||