Even LiDAR won’t really help you with this one. While window glass is (usually) designed to be more or less transparent to visible light, its behavior in infrared varies dramatically: it can be anywhere from transparent to reflective to opaque. For this reason, any technique that works outside the visible spectrum is liable to behave rather unpredictably around windows.
I can think of something you can try, but it has less to do with AR than with image processing in general. Train a CNN or other classifier to recognize windows (or, perhaps more usefully, window frames), then feed that classifier occasional frames from your camera feed as you track in AR. When the classifier tells you some portion of an image is a window or frame, reproject that label back into 3D using the camera pose the tracker recorded for that image and your knowledge of the structure of the world (detected planes or a reconstructed mesh, for example). Over time, these labels should accumulate into a consensus about the general location of a window in 3D space; from this consensus, it should be reasonably straightforward to guess which features in the world are and aren’t windows.
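To make the reprojection-and-consensus idea a bit more concrete, here is a rough sketch in plain Python. Everything in it is an assumption for illustration: the names (`pixel_to_ray`, `intersect_plane`, `WindowVotes`) are made up, the camera is a simple pinhole looking along +z with a camera-to-world rotation matrix, and the "structure of the world" is reduced to a single known wall plane. Real AR frameworks use their own conventions and richer geometry, and the classifier itself is left out entirely.

```python
import math
from collections import Counter


def pixel_to_ray(u, v, fx, fy, cx, cy, rotation):
    """Back-project pixel (u, v) into a unit ray direction in world space.

    Assumes a pinhole camera looking along +z and a 3x3 camera-to-world
    rotation given as nested lists (both are illustrative assumptions).
    """
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    d = [sum(rotation[i][j] * d_cam[j] for j in range(3)) for i in range(3)]
    norm = math.sqrt(sum(c * c for c in d))
    return [c / norm for c in d]


def intersect_plane(origin, direction, plane_point, plane_normal):
    """Return the point where the ray hits the plane, or None if it misses."""
    denom = sum(d * n for d, n in zip(direction, plane_normal))
    if abs(denom) < 1e-9:  # ray is parallel to the plane
        return None
    t = sum((p - o) * n
            for p, o, n in zip(plane_point, origin, plane_normal)) / denom
    if t <= 0:  # plane is behind the camera
        return None
    return [o + t * d for o, d in zip(origin, direction)]


class WindowVotes:
    """Accumulates 'this is a window' votes on a coarse 3D grid."""

    def __init__(self, cell_size=0.25):
        self.cell_size = cell_size  # grid resolution in world units
        self.votes = Counter()

    def add(self, point):
        # Quantize the 3D point to a grid cell and count one vote for it.
        key = tuple(math.floor(c / self.cell_size) for c in point)
        self.votes[key] += 1

    def consensus(self, min_votes=3):
        # Cells seen as "window" in at least min_votes labeled frames.
        return {k for k, n in self.votes.items() if n >= min_votes}
```

For each frame you send to the classifier, you would call `pixel_to_ray` for the pixels it labels as window, intersect those rays with whatever surface the tracker believes is there, and `add` the hit points. Cells returned by `consensus` are the ones several frames agree on; the `cell_size` and `min_votes` values here are arbitrary and would need tuning.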
Not particularly easy, but at least off the top of my head I think it should work. Hope this helps, and best of luck!