So I’m satisfied that my ASCOM focuser driver is working as intended and while the autofocus routine feels slow (and may be delayed at startup, etc), digging through my log and comparing the timing of the focus images, the overall behavior appears to be responding at appropriate speeds. However, when sending a step from the user interface, while the GUI may update to show the start of the move, the motor moves and stops and the GUI stays frozen for a minute or so before finally updating to the current position of the motor.
This is pretty difficult to analyze, given it’s hard to tell exactly where in the log a manual move was made. However, looking closely at the autofocus run logs, I’m seeing evidence of what I would consider a big no-no for a control system like this. It would appear that SGP is attempting multi-threaded access to the focuser driver, since I’m seeing log entries from a second command (e.g. Link query) in between log entries from a first command (e.g. position query). Either that, or the ASCOM log is failing to capture a handful of events around this area.
I’ve cleaned up a chunk of my driver log around the start of a focusing sequence. Other than a couple of “impossible” log entries around some of the apparently out-of-sequence (multithreaded) calls, I’ve limited the entries to only those from specified ASCOM functions (those used by SGP). I’ve commented where things are in the process for reference. As you can see from this snippet, while some commands follow a logical order, there are things occurring in one loop that don’t occur the same way in the next, indicating that it’s not following the exact same code sequence. Again this would make sense if there were two threads making calls inside each other’s loops.
22:29:44.628 Temperature Get 24.3
22:29:45.266 Position P = 5598
22:29:46.370 Position P = 5598
22:29:47.442 Position P = 5598
22:29:48.514 Position P = 5598
22:29:48.941 TempCompAvailable Get False
22:29:48.942 TempCompAvailable Get False
22:29:48.978 IsMoving Get False
22:29:49.026 Position P = 5598
22:29:49.090 Position P = 5598
22:29:49.091 MaxStep Get 10000
22:29:49.123 IsMoving Get False
22:29:49.138 Move 6758 // Move to start plus hysteresis offset
22:29:49.986 IsMoving Get True
22:29:50.114 Direction Get -1 //Impossible query to an internal function that's only called by IsMoving or Position queries
22:29:52.949 Link Get True
22:29:55.171 Direction Get -1 //Impossible query to an internal function that's only called by IsMoving or Position queries
22:30:02.950 Link Get True
22:30:04.692 Temperature Get 24.3
22:30:05.234 Position P = 6758 - Moving
22:30:05.298 Position P = 6758 // driver detects motion stopped here...
22:30:05.314 IsMoving Get False
22:30:05.362 Position P = 6758
22:30:05.410 Position P = 6758
22:30:05.410 MaxStep Get 10000
22:30:11.282 IsMoving Get False
22:30:11.285 Target Position T = 6398 // Out of sequence log entry (part of Move command)
22:30:12.952 Link Get True
2:30:16.322 Move 6398 // Go to first position (frame 0) -- post-hysteresis move.
22:30:16.338 IsMoving Get True
22:30:16.802 Position P = 6736 - Moving
22:30:16.850 Position P = 6732 - Moving
22:30:16.850 MaxStep Get 10000
22:30:17.010 Position P = 6714 - Moving
22:30:17.010 MaxStep Get 10000
22:30:17.170 Position P = 6690 - Moving
22:30:17.170 MaxStep Get 10000
22:30:22.953 Link Get True
22:30:24.745 Temperature Get 24.3
22:30:27.362 IsMoving Get False // driver detects motion stopped here...
22:30:32.379 MaxStep Get 10000
22:30:32.401 Direction Get 0 //Impossible query to an internal function that's only called by IsMoving or Position queries
22:30:32.954 Link Get True
22:30:42.514 Position P = 6398
22:30:42.562 Position P = 6398
22:30:42.578 IsMoving Get False
22:30:42.626 Position P = 6398
22:30:42.626 MaxStep Get 10000
22:30:42.955 Link Get True
22:30:43.473 Direction Get 0
22:30:44.821 Temperature Get 24.3
22:30:48.514 Direction Get 0 //Impossible query to an internal function that's only called by IsMoving or Position queries
22:30:52.956 Link Get True
22:30:53.601 Position P = 6398
22:30:53.617 IsMoving Get False
22:30:53.665 Position P = 6398 // Assumed start of 8 second exposure (plus download time)
22:30:54.753 Position P = 6398
22:30:55.826 Position P = 6398
22:30:56.898 Position P = 6398
22:30:57.986 Position P = 6398
22:30:59.026 Position P = 6398
22:31:00.097 Position P = 6398
22:31:01.169 Position P = 6398
22:31:02.241 Position P = 6398
22:31:02.957 Link Get True
22:31:03.314 Position P = 6398
22:31:04.369 Position P = 6398 // End of 8 second exposure (storage time)
22:31:08.913 Position P = 6398
22:31:09.178 Temperature Get 24.3
22:31:09.954 IsMoving Get False
22:31:09.969 Move 6198 // Second move
22:31:10.065 IsMoving Get True
22:31:11.025 Position P = 6294 - Moving
22:31:11.105 IsMoving Get True
22:31:11.153 Position P = 6272 - Moving
22:31:11.201 Position P = 6262 - Moving
22:31:11.202 MaxStep Get 10000
22:31:11.361 Position P = 6234 - Moving
22:31:11.361 MaxStep Get 10000
22:31:11.521 Position P = 6206 - Moving
22:31:11.522 MaxStep Get 10000
22:31:11.697 Position P = 6198 // driver detects motion stopped here...
22:31:11.697 MaxStep Get 10000
22:31:11.857 Position P = 6198
22:31:11.857 MaxStep Get 10000
22:31:12.081 Position P = 6198
22:31:12.129 Position P = 6198
22:31:12.145 IsMoving Get False
22:31:12.193 Position P = 6198
22:31:12.959 Link Get True
22:31:13.249 Position P = 6198 // Assumed start of 2nd 8 second exposure (plus download time)
22:31:14.434 Position P = 6198
22:31:15.505 Position P = 6198
22:31:16.577 Position P = 6198
22:31:17.649 Position P = 6198
22:31:18.737 Position P = 6198
22:31:19.809 Position P = 6198
22:31:20.881 Position P = 6198
22:31:21.953 Position P = 6198
22:31:22.960 Link Get True
22:31:23.025 Position P = 6198 // End of 2nd 8 second exposure
22:31:27.185 Position P = 6198
22:31:28.177 IsMoving Get False
Yes, there are multiple threads accessing devices. Generally 1 to update the UI and another to monitor things when moving.
I would generally recommend putting a mutex around any operations that need to be atomic (such as a query/response if your hardware can’t support overlapping I/O). Also it’s generally a good idea to cache values in your driver if your hardware can’t handle a high amount of traffic. Although SGP is much less aggressive than it used to be in this regard.
You can see a sample mutex in action in this focus driver I wrote years ago around the only piece that communicates with the serial port (line 122):
As well as the “throttle” DoUpdate. If I were to write this again I’d certainly make some changes here…but the concept is still the same.
Yeah, I was already planning to add a mutex to address the overlapping issue. Still not sure that explains (or will fix) the reason that the UI isn’t updating/responding in a reasonable time frame though. I’d started the post specifically for that question but by the time I’d finished it the log was showing me the multi-threading issue.
So adding the mutex around the serial communication (but not each function in the interface) seems to have resolved the issue with the GUI not being responsive. However, the focus routine still seems to have a problem at the last position where it regularly sits there at 100% complete in the auto focus dialog, with the “image preview open” message and a scrolling progress bar on the main window and only the cancel button enabled. I end up having to abort and retry, which ends up leaving the focuser at some undesired position.
Yeah and other than it originally complaining because I had the temperature sensor as something optional and threw a “not connected” exception rather than a “not supported” exception if it wasn’t there, everything’s fine. Changing that passes everything (since I don’t have one of my temperature sensors here to test with, but I know that works fine too).
Error handling
Error number for “Not Implemented” is: 80040400
Error number for “Invalid Value 1” is: 80040404
Error number for “Value Not Set 1” is: 80040402
Error number for “Value Not Set 2” is: 80040403
Error messages will not be interpreted to infer state.
19:03:35.516 Driver Access Checks OK
19:03:36.166 AccessChecks OK Successfully created driver using late binding
19:03:37.950 AccessChecks OK Successfully connected using late binding
19:03:37.955 AccessChecks INFO The driver is a COM object
19:03:38.599 AccessChecks INFO Device does not expose IFocuser interface
19:03:45.313 AccessChecks INFO Device exposes IFocuserV2 interface
19:03:45.531 AccessChecks OK Successfully created driver using driver access toolkit
19:03:52.195 AccessChecks OK Successfully connected using driver access toolkit
19:03:52.312 AccessChecks OK Successfully disconnected using driver access toolkit
Conform is using ASCOM.DriverAccess.Focuser to get a Focuser object
19:03:52.438 ConformanceCheck OK Driver instance created successfully
19:03:59.148 ConformanceCheck OK Connected OK
Common Driver Methods
19:03:59.236 InterfaceVersion OK 2
19:03:59.265 Connected OK True
19:03:59.297 Description OK ASCOM Focuser Driver for OROFocus.
19:03:59.329 DriverInfo OK Orion Ranch Observatory focus motor controller. Version: 6.0
19:03:59.360 DriverVersion OK 6.0
19:03:59.393 Name OK ORO Focuser
19:03:59.423 CommandString INFO Conform cannot test the CommandString method
19:03:59.430 CommandBlind INFO Conform cannot test the CommandBlind method
19:03:59.437 CommandBool INFO Conform cannot test the CommandBool method
19:03:59.446 Action INFO Conform cannot test the Action method
19:03:59.458 SupportedActions OK Driver returned an empty action list
Properties
19:03:59.569 Absolute OK True
19:03:59.612 IsMoving OK False
19:03:59.620 MaxStep OK 10000
19:03:59.628 MaxIncrement OK 10000
19:03:59.724 Position OK 5000
19:03:59.744 StepSize OK Optional member threw a PropertyNotImplementedException exception.
19:03:59.755 TempCompAvailable OK False
19:03:59.763 TempComp Read OK False
19:03:59.776 TempComp Write OK Temperature compensation is not available and a PropertyNotImplementedException exception was generated as expected
19:03:59.787 Temperature OK Optional member threw a PropertyNotImplementedException exception.
Methods
19:03:59.852 Halt OK Focuser halted OK
19:03:59.964 Move Moving to position: 6000
19:04:06.315 Move OK Asynchronous move found
19:04:06.410 Move OK Absolute move OK
19:04:06.425 Move INFO Returning to original position: 5000
Conformance test complete
No errors, warnings or issues found: your driver passes ASCOM validation!!
I turned logging on after the last failure and haven’t had it fail again, yet. It looks like the problem may be that it’s not making the secondary hysteresis move on that last step. We’ll know more if I can get it to fail again.
Yeah, I’m referring to the backlash setting. Looking at the log from last night, the one thing I’m seeing that may be the culprit is that I’m still getting a rounding error in the scaling in my stepper code where the target and the result are off by one when the reference goes negative. That’s a common problem for truncation, but I’m not sure why it’s happening here since I thought I had that handled properly. I’ll have to look into it further, but not today!
There is currently some logic in SGP where we validate the position of the motor. Essentially We told you to go to 1000, we'll wait until you get there even if you say you're done. I’ve been contemplating removing this logic. It is likely what is causing your issue if you’re having a rounding error. But if this is removed it probably won’t happen immediately.
Thanks Jared, that’s what I suspected. It’s something of a catch 22 on the check or not to check. In this case it could get by without the check, but if there was truly a failure that was way off, it’s good to have SGP report it. In my case I can fix it since I don’t like it reporting the wrong value and I know it’s just a math problem, but there are cases (e.g. servos a la the Celestron DEC cogging problem) where the positioner DOES reach the requested target, but then afterwards drifts away so that the reported position doesn’t match the target. In such a case, having an option to ignore the error would be good. Perhaps you can configure a user defined tolerance (-1 disables)?
where the return value was an integer. The first section should have been the negative rounding, subtracting 0.5 then truncate vs. add and truncate for positive. The code in the driver that reverses this to return an integer to ASCOM has it right.
I’m still somewhat undecided if I still shouldn’t make a change to this. I designed the stepper drive to have completely scaleable units (multiplier and divider as shown above, although since it ended up as floating point math, I really don’t need both) and I somewhat arbitrarily chose the unit to be degrees on the output shaft of the stepper gear motor (not the focuser shaft). The problem is with 4096 steps per revolution a degree can’t be accurately represented on this stepper; 4096/360 = 11.3778 which rounds to 11 counts (was rounding to 10 with my bug!). Of course coming back the other way that becomes 0.9668. It’s not a big deal since we use absolute targets so the error doesn’t build, but nonetheless, it’s annoying. I could go back to just using raw steps, but it just makes all the numbers about 10x larger. Decisions, decisions!
Well, after applying the above fix and still having problems, I went through all the trouble of switching everything over to using raw stepper units so there couldn’t be any round-off error. It was only after doing that and seeing that the rate of motion didn’t change that I discovered that for some reason the installer wasn’t actually replacing the old DLL, even though it kept deleting the profile, etc. Thus I never tested any of the previous fixes. After manually deleting the DLL and re-installing, now it works. I’m not sure it’s worth going back to the degree based version at this point, although the numbers I’m dealing with now are excessive.
One thing I did encounter during the process and I’m not sure of the culprit. When SGP detects that it doesn’t reach the target it sits there hung (as far as the focuser thread) waiting for it for some time. Trying to get out of that at one point I disconnected and reconnected from the ASCOM driver. Best I could tell, that resulted in the previous keypresses I’d sent being forwarded and the focuser took off to never-never land. There was nothing I could do in SGP to stop it. I had to kill SGP and load my other test program to regain control and move it back where it belonged. That was a bit of a hairy experience!